1 Problem Statement


The objective of this research grant is to develop a new mathematical formalism for cooperative
multiagent system synthesis that is explicitly designed to accommodate sophisticated
social relationships such as negotiation and compromise. To do so, we focus on the foundational
assumptions that undergird multiagent decision making, and challenge the adequacy of
the classical assumptions for the design of socially sophisticated artificial multiagent systems.

Classical  approaches  to  multiagent  decision  making,  such  as  von  Neumann-Morgenstern 
game theory [19] and social choice theory [1, 4, 10], are founded upon two key assumptions.

•  It  is  assumed  that  each  member  of  a  multiagent  system  possesses  a  well-defined  total 
preference  ordering  over  all  of  the  feasible  actions  of  the  collective.  Such  preference 
orderings  are  categorical  in  the  sense  that  they  are  unconditional  —  once  defined,  the 
preference  orderings  are  immutable  and  are  viewed  as  the  selfish  desires  of  the  members 
even  if,  ostensibly,  they  express  some  notion  of  altruism  by  substituting  the  preferences 
of  others  for  one’s  own. 

•  It  is  assumed  that  each  member  will  seek  to  maximize  benefit  to  itself,  regardless  of 
the  effect  doing  so  has  on  other  members. 

These  two  assumptions  form  the  basis  of  the  classical  doctrine  of  individual  rationality. 
Perhaps the best-known game-theoretic instantiation of this doctrine is the concept
of Nash equilibrium: a state of mutual constrained optimization for all members in the sense
that no member can become more satisfied by unilaterally deviating from the equilibrium state.
Individual  rationality  is  appropriate  for  competitive  social  situations,  but  does  not  provide  a 
framework  within  which  sophisticated  social  relationships  can  be  easily  modeled  and,  hence, 
is  not  well  suited  as  a  model  for  cooperative  multiagent  systems. 

The social choice solution to the multiagent system decision problem is to combine, or aggregate,
the utilities of each individual to form a social welfare function to be maximized. As
with  the  game-theoretic  approach,  however,  classical  social  choice  approaches  use  categorical 
utilities,  and  do  not  account  for  social  relationships  among  the  individuals. 

We  introduce  a  significant  departure  from  classical  approaches  to  multiagent  decision 
making.  Our  approach  differs  from  the  classical  formulation  in  three  major  ways. 

1.  Conditioning.  We  relax  the  assumption  that  each  member  of  a  multiagent  system 
possesses  a  total  preference  ordering  over  all  feasible  actions  of  the  collective.  We 
assume,  instead,  that  members  of  a  multiagent  system  are  able  to  modulate  their 
preferences  as  a  function  of  the  preferences  of  others.  To  account  for  this  change,  we 
replace  categorical  utilities  with  conditional  utilities  that  are  designed  to  express  the 
preferences  of  each  individual  as  a  function  of  the  preferences  of  others,  as  appropriate. 

2.  Coherence.  We  invoke  a  weak  notion  of  equity  by  assuming  that  a  minimal  condition 
for  meaningful  negotiations  to  take  place  is  for  each  member  of  the  system  to  have  a 
“seat  at  the  table”  in  the  sense  that  its  interests  at  least  have  a  chance  of  being  taken 
seriously  by  the  group  as  a  whole.  Stated  more  formally,  we  require  the  system  to 
be  coherent,  meaning  that  no  individual  can  be  categorically  subjugated  in  the  sense 
that  every  action  that  is  acceptable  to  the  collective  requires  the  individual  to  be 
disadvantaged. Such an individual would effectively be disenfranchised, and would
not  be  in  a  position  to  undertake  meaningful  negotiations.  This  structure  does  not 
eliminate  hierarchical  systems;  rather,  it  simply  means  that,  even  in  master/slave 
relationships,  the  possibility  exists  (but  not  the  guarantee)  that  the  slave’s  preferences 
can  be  acceptable  to  the  master.  For  a  slave  to  be  categorically  subjugated,  every 
action  that  is  good  for  the  master  would  have  to  be  bad  for  the  slave. 

3.  Satisficing.  We  replace  the  notion  of  optimization  with  a  concept  of  being  adequate, 
or  good  enough.  The  terminology  we  use  for  this  concept  is  satisficing.  This  term  was 
initially  introduced  by  Simon  [11-13],  who  addressed  the  question  of  how  a  decision 
maker  might  choose  in  the  presence  of  informational  or  computational  limitations. 
Simon’s approach is to seek an optimal choice, but to terminate the search once the
decision  maker’s  aspiration  level  has  been  met.  A  slightly  different  notion  of  satisficing 
is  to  accept  the  best  solution  so  far  obtained,  once  the  cost  of  continuing  to  search 
exceeds  the  expected  improvement  in  value  were  the  search  to  continue.  Many  other 
variations  of  this  concept  have  appeared  in  the  literature  and  it  is  not  the  intent  of  this 
report  to  review  them  in  detail.  Suffice  it  to  say,  however,  that  all  of  these  approaches 
view  satisficing  as  a  species  of  bounded  rationality:  one  settles  for  a  solution  that  is 
deemed  to  be  “good  enough,”  but  which  is  not  necessarily,  and  usually  not,  optimal  in 
any meaningful sense. Satisficing à la Simon is a heuristic approximation to the ideal
of being best (and is only constrained from achieving this ideal by practical limitations).

The concept of satisficing we employ, however, differs from the aforementioned notions
in  several  important  ways. 

(a)  In  contrast  to  satisficing  as  advanced  by  Simon  and  others,  it  is  not  heuristic; 
rather,  it  is  a  concept  that  is  as  mathematically  formalized  and  precise  as  is  the 
notion  of  optimization. 

(b)  It  treats  being  good  enough  as  the  ideal  (rather  than  an  approximation)  —  it  is 
not  a  species  of  bounded  rationality. 

(c) It naturally extends to the multi-agent case, thereby providing a natural framework
for multi-agent decision making.

(d) It readily accommodates the extension of interests beyond the self, thereby accommodating
more sophisticated social relationships than self-interest affords.

We  retain  the  term  “satisfice”  because,  even  though  our  approach  is  not  heuristic,  we 
nevertheless  seek  solutions  that  are  good  enough,  with  the  essential  difference  being 
that  we  provide  a  non-heuristic  and  mathematically  precise  definition  of  what  it  means 
to  be  good  enough. 

While  optimization  is  intrinsically  an  individual  concept  (if  a  group  is  to  optimize,  it 
must  act  as  an  individual),  satisficing,  as  we  define  it,  is  a  social  concept:  what  is  best 
for  you  may  be  incompatible  with  what  is  best  for  me,  but  what  is  good  enough  for 
you  can  also  be  good  enough  for  me,  provided  we  each  have  some  flexibility  regarding 
what we view as good enough.

To motivate our concept of satisficing, we note that humans often invoke a systematic
approach to decision making that, while still based on quantitative measures of performance,
does not correspond to optimization. In the vernacular, the optimization
paradigm corresponds to seeking “the best and only the best” solution. Also common,
however, is the paradigm of “getting your money’s worth,” or ensuring that the benefits
are greater than the costs. This notion of being good enough is the satisficing
paradigm that we advocate. A comprehensive introduction to this perspective can be
found in [15].


2  Summary  of  Results:  Negotiations 

A  multiagent  system  comprises  a  collective  of  agents  who  must  work  cohesively  to  accomplish 
some  fundamental  objective.  Typically,  however,  such  systems  are  mixed-motive,  in  the 
sense  that  the  interests  of  all  individuals  will  not  all  coincide  perfectly;  hence,  opportunities 
for  both  cooperation  and  competition  will  exist.  The  major  contribution  of  this  study  is 
the  development  of  a  mathematical  framework  that  accommodates  both  cooperative  and 
competitive  aspects  of  a  multiagent  system.  In  this  section  we  briefly  describe  the  three 
main  components  of  our  theory  (conditioning,  coherence,  and  satisficing)  and  show  how  they 
are  used  to  define  a  framework  within  which  to  conduct  negotiations.  Publications  arising 
from  this  research  are  [16, 17],  which  are  included  in  Appendices  A  and  B,  respectively. 

2.1  Conditional  Utilities 

Let $\{X_1, \ldots, X_n\}$, $n \ge 2$, denote a group of autonomous decision makers. Let $A_i$ denote a
finite set of feasible actions available to $X_i$, $i = 1, \ldots, n$, let $\mathcal{A} = A_1 \times \cdots \times A_n$ denote the
product action space, and let $\mathbf{a} = (a_1, \ldots, a_n)$ denote the action profile that obtains when
each $X_i$ instantiates $a_i \in A_i$. A categorical utility for $X_i$ is a mapping $u_{X_i}\colon \mathcal{A} \to \mathbb{R}$ such that
$u_{X_i}(\mathbf{a}) > u_{X_i}(\mathbf{a}')$ if $X_i$ strictly prefers $\mathbf{a}$ to $\mathbf{a}'$ and $u_{X_i}(\mathbf{a}) = u_{X_i}(\mathbf{a}')$ if $X_i$ is indifferent between
$\mathbf{a}$ and $\mathbf{a}'$. Classical decision-theoretic approaches, such as von Neumann-Morgenstern game
theory, employ categorical utilities (i.e., they are the payoffs of a game).

A  conditional  utility  differs  from  a  categorical  utility  in  that  it  is  a  hypothetical,  rather 
than  a  concrete,  expression.  Before  formally  defining  a  conditional  utility,  we  must  first 
introduce  the  notion  of  a  commitment.  In  the  interest  of  clarity,  we  temporarily  restrict  our 
discussion to a two-agent system $(X_1, X_2)$. Now suppose, from the point of view of $X_2$, that
$X_1$ views $\mathbf{a} = (a_1, a_2)$ to be its most preferred joint action. We shall call this hypothetical
constraint on $X_1$ a commitment. A commitment, therefore, represents the antecedent of a
hypothetical proposition, the consequent of which is a conditional utility denoted $u_{X_2|X_1}(\cdot \mid \mathbf{a})$.
More generally, for an $n$-agent system, if $X_i$ is influenced by the $p_i$-element sub-collective
$\{X_{i_1}, \ldots, X_{i_{p_i}}\}$, then the conditional utility of $X_i$ is of the form $u_{X_i|X_{i_1} \cdots X_{i_{p_i}}}(\mathbf{a}_i \mid \mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_{p_i}})$.

In contrast to categorical utilities, a conditional utility expresses $X_i$’s preferences over $\mathcal{A}$
given the commitments of all other agents that influence it. In the most general case, each
agent would be influenced by every other agent, but it is often the case that agents will be
most heavily influenced by their immediate neighbors. For example, hierarchical organizations
are organized so that superiors influence subordinates. Other multiagent systems are
organized into small, loosely connected clusters. Thus, although a fully connected system is
possible, many interesting multiagent systems are relatively sparsely connected. In this
project we focus on systems whose influence relationships can be represented graphically
with a directed acyclic graph, or DAG. The vertices of the graph represent the various members
of the collective, and the edges represent the conditional utilities. Figure 1 illustrates
an influence network for a five-member multiagent system. We see that $X_1$ influences $X_2$,
who in turn influences $X_3$ and $X_5$. $X_5$ is also influenced by $X_4$. Finally, $X_3$ is influenced
by $X_2$ and $X_5$. Since $X_1$ and $X_4$ are root vertices, they possess categorical utilities $u_{X_1}$ and
$u_{X_4}$ (not shown on the graph). It should be noted that, if all utilities were categorical, then
this graph would have no edges: it would consist of $n$ isolated vertices, each possessing a
categorical utility. With this more general model, only the root vertices possess categorical
utilities; all others possess conditional utilities.


Figure 1: An influence network for a five-member multiagent system.


2.2  Coherence 

A collective that possesses the property that none of its members can be categorically subjugated
is said to be coherent. We have appropriated this term from probability theory, since
the notion of avoiding sure subjugation is completely analogous to the probabilistic notion
of avoiding sure loss. As the Dutch Book Theorem and its converse establish, the only way
for a gambler to avoid a situation of sure loss (his payoff is less than his stake regardless of
the outcome) is for him to place bets in accordance with the axioms of probability. One
of the key results of our investigation is to demonstrate that, similarly, the only way for
a member of a collective to avoid categorical subjugation is for all utilities to possess the
mathematical structure of conditional or marginal mass functions. Under this constraint,
the edges in Figure 1 are conditional mass functions, and the graph therefore possesses the
mathematical structure of a Bayesian network (albeit with different semantics). Conventionally,
Bayesian networks operate in the epistemological¹ domain; that is, involving random
phenomena. To distinguish between the conventional probabilistic application of Bayesian
networks and our praxeological² application, we shall refer to networks such as the one depicted
in Figure 1 as praxeic networks.

¹ Epistemology relates to the categorization of propositions in terms of knowledge and belief.

² Praxeology relates to the categorization of actions in terms of their effectiveness and efficiency.

The DAG structure, coupled with the fact that the edges are mass functions, permits
a natural way to aggregate the preference orderings of the individuals to form a group
preference ordering. As is well known from Bayesian network theory, the so-called Markov
condition states that nondescendant nonparents of a vertex have no influence on
the vertex, given the state of its parent vertices [2]. Accordingly, just as the multivariate
probability mass function is formed as the product of the conditional and marginal mass
functions of a Bayesian network, the multiagent utility of the collective is formed as the product
of the conditional and marginal utilities of the praxeic network. Thus, the multiagent utility
associated with the network illustrated in Figure 1 is

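$$u_{X_1 X_2 X_3 X_4 X_5}(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4, \mathbf{a}_5) = u_{X_1}(\mathbf{a}_1)\, u_{X_2|X_1}(\mathbf{a}_2 \mid \mathbf{a}_1)\, u_{X_3|X_2 X_5}(\mathbf{a}_3 \mid \mathbf{a}_2, \mathbf{a}_5)\, u_{X_4}(\mathbf{a}_4)\, u_{X_5|X_2 X_4}(\mathbf{a}_5 \mid \mathbf{a}_2, \mathbf{a}_4). \tag{1}$$

More generally, let $\mathrm{pa}(X_i) = (X_{i_1}, \ldots, X_{i_{p_i}})$ denote the $p_i$ parents of $X_i$, and let $u_{X_i|\mathrm{pa}(X_i)}$
denote the conditional utility of $X_i$ given its parents. If a vertex has no parents, then the
conditional utility becomes a categorical utility; that is, $u_{X_i|\mathrm{pa}(X_i)} = u_{X_i}$ if $\mathrm{pa}(X_i) = \emptyset$. The
multiagent utility then becomes

$$u_{X_1 \cdots X_n}(\mathbf{a}_1, \ldots, \mathbf{a}_n) = \prod_{i=1}^{n} u_{X_i|\mathrm{pa}(X_i)}[\mathbf{a}_i \mid \mathrm{cp}(X_i)], \tag{2}$$

where $\mathrm{cp}(X_i) = \{\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_{p_i}}\}$.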

The probabilistic syntax of the utilities constructed in this way provides a natural way
to link the praxeological and epistemological aspects of a decision problem into a common
unifying framework. To illustrate, let us modify the network in Figure 1 by replacing vertex
$X_5$ with a random variable, $\theta$, as illustrated in Figure 2, where $u_{X_3|X_2\theta}$ is a utility conditioned
on the commitment of $X_2$ and the value that $\theta$ assumes and $p_{\theta|X_2 X_4}$ is a probability mass
function conditioned on the commitments of $X_2$ and $X_4$. The resulting multiagent utility is
of the form

$$u_{X_1 X_2 X_3 X_4 \theta}(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4, \vartheta) = u_{X_1}(\mathbf{a}_1)\, u_{X_2|X_1}(\mathbf{a}_2 \mid \mathbf{a}_1)\, u_{X_3|X_2\theta}(\mathbf{a}_3 \mid \mathbf{a}_2, \vartheta)\, u_{X_4}(\mathbf{a}_4)\, p_{\theta|X_2 X_4}(\vartheta \mid \mathbf{a}_2, \mathbf{a}_4), \tag{3}$$

where $\vartheta$ is the value assumed by the random variable $\theta$. The expected utility is then obtained
by averaging over the values that $\theta$ may assume, yielding

$$u_{X_1 X_2 X_3 X_4}(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4) = \sum_{\vartheta} u_{X_1 X_2 X_3 X_4 \theta}(\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4, \vartheta). \tag{4}$$

This  result  extends  to  the  general  n-dimensional  case  in  the  obvious  way. 

The most general formulation of this framework assumes that each individual’s utility
is defined over the product action space $\mathcal{A}$, given the commitments of each of its parents
to action profiles in $\mathcal{A}$, as is presented above. For many applications, however, this full
generality is not necessary, since it is often reasonable to assume that agents’ utilities are
defined with respect to their own actions, given commitments by others to only their own
actions. Thus, we introduce the notion of decoupling.

Figure 2: A praxi-epistemic network for a four-member multiagent system.

Definition  1  A  multiagent  system  is  conditionally  decoupled  if  the  conditional  preference 
of  each  agent  is  a  function  only  of  its  own  actions,  given  the  commitments  of  its  parents  to 
their  own  actions. 

For  a  decoupled  multiagent  system,  (3)  becomes 

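$$u_{X_1 X_2 X_3 X_4 \theta}(a_1, a_2, a_3, a_4, \vartheta) = u_{X_1}(a_1)\, u_{X_2|X_1}(a_2 \mid a_1)\, u_{X_3|X_2\theta}(a_3 \mid a_2, \vartheta)\, u_{X_4}(a_4)\, p_{\theta|X_2 X_4}(\vartheta \mid a_2, a_4). \tag{5}$$

In general, the individual conditional utilities are of the form

$$u_{X_i|\mathrm{pa}(X_i)}[a_i \mid \mathrm{cp}(X_i)] = u_{X_i|\mathrm{pa}(X_i)}(a_i \mid a_{i_1}, \ldots, a_{i_{p_i}}), \tag{6}$$

where the action sub-profile $\{a_{i_1}, \ldots, a_{i_{p_i}}\}$ corresponds to the commitments by $\mathrm{pa}(X_i) =
(X_{i_1}, \ldots, X_{i_{p_i}})$, and the multiagent utility is of the form

$$u_{X_1 \cdots X_n}(a_1, \ldots, a_n) = \prod_{i=1}^{n} u_{X_i|\mathrm{pa}(X_i)}(a_i \mid a_{i_1}, \ldots, a_{i_{p_i}}). \tag{7}$$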

For  the  remainder  of  this  report  we  focus  on  decoupled  systems. 
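
As a minimal numerical sketch of the product form (7), the following Python fragment builds the five-agent network of Figure 1 from randomly generated conditional mass functions and checks that the resulting multiagent utility sums to one over all action profiles, as a joint mass function should. The binary action sets, the random tables, and the names `random_cmf`, `parents`, and `multiagent_utility` are illustrative assumptions and do not appear in the report.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K = 2  # assumption: every agent has two actions, labeled 0 and 1


def random_cmf(n_parents):
    """Random conditional mass function u(a | parents' actions).

    Leading axes index the parents' committed actions; the last axis
    indexes the agent's own action and sums to one.
    """
    table = rng.random((K,) * (n_parents + 1))
    return table / table.sum(axis=-1, keepdims=True)


# Parent sets of the influence network of Figure 1 (agents are 0-based:
# index 0 is X_1, ..., index 4 is X_5).
parents = {0: (), 1: (0,), 2: (1, 4), 3: (), 4: (1, 3)}
utilities = {i: random_cmf(len(pa)) for i, pa in parents.items()}


def multiagent_utility(profile):
    """Product of each agent's utility given its parents' commitments."""
    value = 1.0
    for i, pa in parents.items():
        index = tuple(profile[j] for j in pa) + (profile[i],)
        value *= utilities[i][index]
    return value


profiles = list(itertools.product(range(K), repeat=len(parents)))
total = sum(multiagent_utility(p) for p in profiles)
print(f"sum over all profiles = {total:.6f}")
```

Root agents (empty parent tuple) carry a one-dimensional table, which plays the role of a categorical utility in this sketch.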

2.3  Satisficing 

Even though optimization is often taken as the sine qua non of formalized decision-making
procedures,  humans  are  often  wont  to  evaluate  propositions  in  terms  of  the  upside  versus 
the  downside,  the  pluses  versus  the  minuses,  the  benefits  versus  the  costs,  and  so  forth.  One 
of  the  important  omissions  in  the  extant  literature  is  a  systematic  formal  treatment  of  this 
mode  of  evaluating  possible  choices.  An  important  result  of  earlier  research  by  the  principal 
investigator  is  the  introduction  of  a  formalized  mathematical  treatment  of  this  alternative 
mode  of  decision  making.  It  should  be  noted  that  this  approach  has  been  inspired  by  the 
work  of  the  philosopher  Isaac  Levi  [6],  who  proposed  a  novel  way,  using  the  mathematics  of 
probability  theory,  to  improve  one’s  knowledge.  In  [15],  the  principal  investigator  applied 
Levi’s  approach  to  the  praxeological  domain  and  extended  it  to  the  multiagent  case. 

Conventional  utilities  combine  all  costs  and  benefits  of  taking  action  into  a  single  function. 
One  common  approach  is  to  define  utility  as  a  linear  combination  of  those  aspects  of  taking 
action that relate to the effectiveness (benefits) of taking an action and those aspects that
relate  to  the  inefficiency  (costs)  of  taking  the  action.  In  practice,  the  weights  of  these  two 
facets  of  taking  action  become  tuning  parameters  to  facilitate  the  design  of  a  system  that 
provides  acceptable  performance  (at  the  end  of  the  day,  even  optimization  is  subjective). 

Many theorists (e.g., [1, 3, 5, 7]) have argued, however, that it is unwise to aggregate conflicting
interests into a single preference ordering. Some have asserted that in a social setting
individuals have multiple facets, as defined by Steedman and Krause [14], who maintain that
an agent, although an indivisible unit, nevertheless is capable of considering its choices from
different points of view, and that separate utilities may be defined to correspond to each
facet of an individual. A natural way to classify attributes is according to their effectiveness
and efficiency. Each individual may be viewed as being composed of two facets: the selecting
facet, which evaluates actions in terms of effectiveness toward pursuing objectives without
concern for efficiency, and the rejecting facet, which evaluates actions in terms of efficiency
with respect to consuming resources without concern for effectiveness. We shall view these
selecting and rejecting facets as the “atoms” of the system. Notationally, we define $S_i$ and
$R_i$ as the selecting and rejecting facets, respectively, of $X_i$.

Accordingly,  we  define  separate  utilities  for  the  selecting  facet  and  the  rejecting  facet. 
In  accordance  with  the  conditioning  and  coherence  properties,  these  utilities  are  conditional 
mass  functions.  Each  agent  has  a  unit  of  selecting  utility  to  apportion  among  the  feasible 
actions and a unit of rejecting inutility also to apportion. An $n$-agent system thus comprises
$2n$ atoms: $n$ selecting facets and $n$ rejecting facets, and the graph of such a system comprises
$2n$ praxeic vertices whose edges are conditional utilities. Figure 3 illustrates a refinement,
in terms of the facets, of the influence relationships originally defined by Figure 2. This
network reveals more explicitly just how the agents influence each other. We see that $S_1$
influences $R_2$, $S_4$ influences $\theta$, and so forth. Also, facets $R_1$ and $R_4$ are not influenced by
any other facets and hence, in addition to $S_1$, $S_2$, and $S_4$, are root nodes.


Figure  3:  A  Satisficing  network  for  a  four-member  multiagent  system. 

According  to  the  fundamental  property  of  Bayesian  networks,  we  may  form  the  multiagent 
utility  as  the  product  of  all  marginal  and  conditional  utilities,  yielding 

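$$u_{S_1 S_2 S_3 S_4 R_1 R_2 R_3 R_4 \theta}(a_1, a_2, a_3, a_4, a_1', a_2', a_3', a_4', \vartheta) = u_{S_1}(a_1)\, u_{S_2}(a_2)\, u_{S_3|R_2\theta}(a_3 \mid a_2', \vartheta)\, u_{S_4}(a_4)\, u_{R_1}(a_1')\, u_{R_2|S_1}(a_2' \mid a_1)\, u_{R_3|S_2}(a_3' \mid a_2)\, u_{R_4}(a_4')\, p_{\theta|S_4 R_2}(\vartheta \mid a_4, a_2'), \tag{8}$$

and the expected utility is

$$u_{S_1 S_2 S_3 S_4 R_1 R_2 R_3 R_4}(a_1, a_2, a_3, a_4, a_1', a_2', a_3', a_4') = \sum_{\vartheta} u_{S_1 S_2 S_3 S_4 R_1 R_2 R_3 R_4 \theta}(a_1, a_2, a_3, a_4, a_1', a_2', a_3', a_4', \vartheta). \tag{9}$$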

This  expected  utility  is  called  the  aggregation  function.  Analogous  to  the  way  a  joint 
probability distribution captures all of the interdependencies among multiple random variables,
the aggregation function captures all of the inter-relationships among the facets of a
multiagent  system. 


2.4  Negotiation 

2.4.1  Optimal  Compromise 

The  three  components  of  conditioning,  coherence,  and  satisficing  provide  a  framework  within 
which members of a multiagent system can negotiate and compromise. The key feature that
enables  this  ability  is  that  the  satisficing  approach  provides  a  set  of  acceptable  actions,  rather 
than  a  singleton  set  comprising  the  optimal  action. 

For  a  non-decoupled  system,  the  utilities  are  functions  of  the  entire  action  profile,  but  for 
a  decoupled  system,  the  utilities  are  functions  of  individual  actions.  With  this  restriction, 
the  multiagent  utility  function  becomes 

$$u_{S_1 \cdots S_n R_1 \cdots R_n}(a_1, \ldots, a_n, a_1', \ldots, a_n') = \prod_{i=1}^{n} u_{S_i|\mathrm{pa}(S_i)}[a_i \mid \mathrm{cp}(S_i)] \prod_{j=1}^{n} u_{R_j|\mathrm{pa}(R_j)}[a_j' \mid \mathrm{cp}(R_j)], \tag{10}$$

where $\mathrm{cp}(S_i)$ and $\mathrm{cp}(R_j)$ denote the commitments by $\mathrm{pa}(S_i)$ and $\mathrm{pa}(R_j)$, respectively.
The corresponding joint selectability and rejectability marginals are given by

$$u_{S_1 \cdots S_n}(a_1, \ldots, a_n) = \sum_{a_1', \ldots, a_n'} u_{S_1 \cdots S_n R_1 \cdots R_n}(a_1, \ldots, a_n, a_1', \ldots, a_n') \tag{11}$$

and

$$u_{R_1 \cdots R_n}(a_1', \ldots, a_n') = \sum_{a_1, \ldots, a_n} u_{S_1 \cdots S_n R_1 \cdots R_n}(a_1, \ldots, a_n, a_1', \ldots, a_n'). \tag{12}$$

We may now define a social welfare function as

$$W(a_1, \ldots, a_n) = u_{S_1 \cdots S_n}(a_1, \ldots, a_n) - q_G\, u_{R_1 \cdots R_n}(a_1, \ldots, a_n), \tag{13}$$

where $q_G \in [0, 1]$ regulates the threshold for rejecting elements of $\mathcal{A}$. Nominally, $q_G = 1$, but
as we shall see, this parameter serves as a negotiating parameter. The jointly satisficing set is
the set of action profiles that are jointly satisficing for the system as a whole, and is defined
as

$$\mathbf{S} = \{(a_1, \ldots, a_n) \in \mathcal{A}\colon W(a_1, \ldots, a_n) \ge 0\}. \tag{14}$$
This set, however, does not account for the possibility that the elements of $\mathbf{S}$ may not be
acceptable to all (or any) of the individuals. Thus, we must also compute the individual satisficing
sets. To proceed, we must first compute the selectability and rejectability marginals
as

$$u_{S_i}(a_i) = \sum_{\neg a_i} u_{S_1 \cdots S_n}(a_1, \ldots, a_n) \tag{15}$$

and

$$u_{R_i}(a_i) = \sum_{\neg a_i} u_{R_1 \cdots R_n}(a_1, \ldots, a_n), \tag{16}$$

respectively, where $\sum_{\neg a_i}$ is the so-called “not-sum” notation, meaning that the sum
is taken over all arguments except $a_i$.

We define the individually satisficing sets as

$$\Sigma_i = \{a_i \in A_i\colon u_{S_i}(a_i) \ge q_i\, u_{R_i}(a_i)\}, \tag{17}$$

where $q_i \in [0, 1]$ is $X_i$’s individual negotiation index. This set includes all alternatives
that are satisficing, or good enough, for $X_i$ at the given negotiation index. The satisficing
rectangle is the set of all action profiles such that each component is individually satisficing,
and is given by

$$\mathcal{R} = \Sigma_1 \times \cdots \times \Sigma_n. \tag{18}$$

The intersection of the jointly satisficing set and the satisficing rectangle yields the compromise
set, comprising the action profiles that are simultaneously good enough for the group
and for each individual:

$$\mathcal{C} = \mathbf{S} \cap \mathcal{R}. \tag{19}$$

If $\mathcal{C} \neq \emptyset$, then we may form a best compromise as

$$\mathbf{a}^* = \arg\max_{\mathbf{a} \in \mathcal{C}} W(\mathbf{a}). \tag{20}$$
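
The construction of the jointly satisficing set, the individually satisficing sets, the satisficing rectangle, the compromise set, and the best compromise can be sketched in a few lines of Python. The sketch assumes that the joint selectability and rejectability marginals are already available as arrays over action profiles; the numerical tables, the two-agent three-action setting, and the function name `compromise` are hypothetical and chosen only for illustration.

```python
import itertools
import numpy as np

# Hypothetical joint selectability and rejectability mass functions for
# two agents with three actions each (rows: a_1, columns: a_2).
u_S = np.array([[0.30, 0.10, 0.05],
                [0.10, 0.20, 0.05],
                [0.05, 0.05, 0.10]])
u_R = np.array([[0.05, 0.10, 0.15],
                [0.10, 0.05, 0.15],
                [0.15, 0.15, 0.10]])


def compromise(u_S, u_R, q_G, q):
    """Return (compromise set, best compromise or None)."""
    n = u_S.ndim
    W = u_S - q_G * u_R                        # social welfare function
    joint = {p for p in np.ndindex(*u_S.shape) if W[p] >= 0}
    individual = []
    for i in range(n):
        others = tuple(k for k in range(n) if k != i)
        s_i = u_S.sum(axis=others)             # selectability marginal
        r_i = u_R.sum(axis=others)             # rejectability marginal
        individual.append([a for a in range(u_S.shape[i])
                           if s_i[a] >= q[i] * r_i[a]])
    rectangle = set(itertools.product(*individual))
    C = joint & rectangle                      # compromise set
    best = max(C, key=lambda p: W[p]) if C else None
    return C, best


C, best = compromise(u_S, u_R, q_G=1.0, q=[1.0, 1.0])
print(sorted(C), best)
```

With these tables, the third action of each agent is individually rejected at $q_i = 1$, so the profile in which both choose it is excluded from the compromise set even though it is jointly satisficing.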

If $\mathcal{C} = \emptyset$, then there are no action profiles that are simultaneously good enough for
the group and each individual. However, the satisficing approach provides a natural and
systematic negotiation framework by which each individual may control the degree to which
it is willing to lower its standards in an attempt to reach a compromise. By lowering its
$q_i$-value incrementally, each $X_i$ increases the size of its satisficing set. By specifying the
increment $\Delta q_i$ by which $X_i$ is willing to reduce its standards, each participant can control the
amount of compromise it is willing to offer others. If enough participants are willing to
lower their $q$-values sufficiently, it is easy to see that, eventually, the compromise set will be
non-empty, and a best compromise can be achieved. Although such negotiations may fail
to reach a compromise that is acceptable to all members, the significant aspect of this type
of negotiation is that no individual is a priori subjugated to the will of the collective in the
sense that there is no possibility for that individual’s preferences to receive consideration.
Thus, every individual can be assured of receiving sufficient benefit, by its own definition,
before agreeing to the compromise.
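
One way to picture this negotiation is as a loop in which every agent lowers its index by its own increment each round until the compromise set becomes non-empty. The following Python sketch makes that reading concrete under several assumptions that are not in the report: all agents concede simultaneously, $q_G$ is held fixed, indices are floored at zero, and the tables and the names `compromise_set` and `negotiate` are hypothetical.

```python
import itertools
import numpy as np


def compromise_set(u_S, u_R, q_G, q):
    """Compromise set and welfare table for joint mass functions."""
    n = u_S.ndim
    W = u_S - q_G * u_R
    joint = {p for p in np.ndindex(*u_S.shape) if W[p] >= 0}
    individual = []
    for i in range(n):
        others = tuple(k for k in range(n) if k != i)
        s_i, r_i = u_S.sum(axis=others), u_R.sum(axis=others)
        individual.append([a for a in range(u_S.shape[i])
                           if s_i[a] >= q[i] * r_i[a]])
    return joint & set(itertools.product(*individual)), W


def negotiate(u_S, u_R, q_G, q, dq):
    """Lower each q_i by dq_i per round until a compromise exists.

    Returns (best compromise or None, final q, number of rounds).
    """
    q = list(q)
    rounds = 0
    while True:
        C, W = compromise_set(u_S, u_R, q_G, q)
        if C:
            return max(C, key=lambda p: W[p]), q, rounds
        new_q = [max(0.0, q_i - d) for q_i, d in zip(q, dq)]
        if new_q == q:                 # nobody can or will concede further
            return None, q, rounds
        q, rounds = new_q, rounds + 1


# Hypothetical two-agent, two-action example with an empty compromise
# set at q = (1, 1).
u_S = np.array([[0.35, 0.33],
                [0.27, 0.05]])
u_R = np.array([[0.40, 0.05],
                [0.05, 0.50]])
best, q, rounds = negotiate(u_S, u_R, q_G=1.0, q=[1.0, 1.0], dq=[0.1, 0.1])
print(best, q, rounds)
```

In this example each agent individually favors its first action, but the group rejects the profile in which both choose it; agreement is reached only once the second agent has lowered its index far enough to admit its second action.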
2.4.2 Nash Bargains

A  bargaining  game  is  a  cooperative  game  in  which  each  participant  possesses  a  disagreement 
point  that  defines  the  benefit  that  is  guaranteed  to  accrue  to  it  if  a  compromise  cannot 
be  reached.  A  well-known  bargaining  concept  that  offers  a  clear  definition  of  individual 
acceptability  is  the  Nash  bargain  [8],  which  permits  each  participant  to  make  maximal  use 
of its strategic strength. Let $d_{X_i}$ denote the disagreement point for $X_i$. The negotiation set,
denoted $\mathcal{N}$, is the subset of action profiles such that every participant achieves at least its
disagreement point. In terms of categorical utilities, the negotiation set is

$$\mathcal{N} = \{\mathbf{a} \in \mathcal{A}\colon u_{X_i}(\mathbf{a}) \ge d_{X_i},\ i = 1, \ldots, n\} \tag{21}$$

and the Nash bargain is

$$\mathbf{a}_N = \arg\max_{\mathbf{a} \in \mathcal{N}} \prod_{i=1}^{n} [u_{X_i}(\mathbf{a}) - d_{X_i}]. \tag{22}$$

The intuitive interpretation of a Nash bargain is that it defines a fair compromise. It enables
each player to take advantage of the strategic strength endowed by its disagreement point.
The higher $X_i$’s disagreement point, the more action profiles that are unfavorable to it are
eliminated.
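
For a finite action space, the negotiation set (21) and the Nash bargain (22) can be computed by direct enumeration. The short Python sketch below does this for a hypothetical two-player game; the payoff values, the disagreement points, and the function name `nash_bargain` are assumptions made for illustration.

```python
import math

# Hypothetical categorical utilities u_{X_i}(a) for two players with two
# actions each, keyed by action profile (a_1, a_2).
utilities = [
    {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 4.0, (1, 1): 2.0},   # player 1
    {(0, 0): 3.0, (0, 1): 4.0, (1, 0): 1.0, (1, 1): 2.0},   # player 2
]
disagreement = [1.5, 1.5]


def nash_bargain(utilities, disagreement):
    """Maximize the product of gains over the negotiation set."""
    profiles = list(utilities[0])
    # Negotiation set: every player gets at least its disagreement point.
    negotiation_set = [
        a for a in profiles
        if all(u[a] >= d for u, d in zip(utilities, disagreement))
    ]
    if not negotiation_set:
        return None
    return max(
        negotiation_set,
        key=lambda a: math.prod(u[a] - d
                                for u, d in zip(utilities, disagreement)),
    )


print(nash_bargain(utilities, disagreement))
```

Here the two asymmetric profiles fall outside the negotiation set because one player would receive less than its disagreement point, which illustrates how a higher disagreement point removes unfavorable profiles from consideration.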

The  structure  of  (22)  suggests  that  the  optimal  group  solution  can  be  interpreted  as  a 
Nash  bargain  with  unilateral  utilities  replaced  by  conditional  utilities  and  all  disagreement 
points  set  to  zero.  Analogously,  therefore,  we  may  define  a  conditional  Nash  bargaining 
solution.  When  decisions  are  made  under  certainty,  the  negotiation  set  is  defined  as 

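$$\mathcal{N} = \{\mathbf{a} \in \mathcal{A}\colon u_{X_i|\mathrm{pa}(X_i)}[\mathbf{a} \mid \mathrm{cp}(X_i)] \ge d_{X_i},\ i = 1, \ldots, n\}. \tag{23}$$

The conditional Nash bargaining solution is

$$\mathbf{a}_N = \arg\max_{\mathbf{a} \in \mathcal{N}} \prod_{i=1}^{n} \left[ u_{X_i|\mathrm{pa}(X_i)}[\mathbf{a} \mid \mathrm{cp}(X_i)] - d_{X_i} \right]. \tag{24}$$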

3  Summary  of  Results:  Attitude  Adaptation 

An  important  benefit  of  the  satisficing  approach  is  that  cooperation  occurs  much  more  readily 
than  under  standard  utility-maximization.  To  examine  this  phenomenon  more  closely,  we 
studied  the  emergence  of  cooperation  using  evolutionary  game  theory.  Evolutionary  game 
theory  [20]  studies  large  populations  of  players  whose  reproductive  potential  is  determined  by 
the  payoff  gained  during  play.  For  infinitely  large,  well-mixed  populations,  the  evolution  of 
the  population  is  described  by  the  replicator  dynamics  [18].  In  the  simplest  case,  all  players 
have  the  same  action  space  A,  and  are  paired  with  one  other  player  each  “round.”  Let  Xi(t) 
be  the  fraction  of  the  population  playing  strategy  a*  €  A  at  time  t.  Then,  the  population 
shares  evolve  according  to  the  following  system  of  differential  equations: 

Xi(t)  =  [u(ai?x(f))  —  it(x(i),x(i))]xj(i),  Vi,  (25) 

where $u(a_i, x(t))$ is the expected utility of playing strategy $a_i$ against a player randomly
drawn from the population described by $x(t)$, and $u(x(t), x(t))$ is the average expected utility.
Essentially, a strategy's population share grows or shrinks if it fares better or worse than
average, respectively. Given appropriate initial conditions, the steady-state of the replicator
dynamics is a Nash equilibrium.
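The replicator dynamics can be sketched numerically with a forward-Euler step. The sketch below is illustrative only: the payoff matrix (a Prisoner's Dilemma-style game in which the second strategy dominates), the step size, and the function names are assumptions and do not come from the report.

```python
def replicator_step(x, payoff, dt=0.01):
    """One forward-Euler step of dx_i/dt = [u(a_i, x) - u(x, x)] * x_i."""
    n = len(x)
    # u(a_i, x): expected payoff of pure strategy i against the population mix x
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    # u(x, x): population-average expected payoff
    average = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * (fitness[i] - average) * x[i] for i in range(n)]


def simulate(x0, payoff, steps=5000, dt=0.01):
    """Iterate the Euler step from the initial population shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical payoffs (rows: own strategy, columns: opponent's strategy).
# Strategy 1 strictly dominates strategy 0, so (1, 1) is the unique Nash equilibrium.
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]

final_shares = simulate([0.5, 0.5], PAYOFF)
```

Starting from an even split, the share of the dominated strategy decays toward zero, so the population settles on the Nash equilibrium strategy of this assumed game.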

To  apply  evolutionary  dynamics  to  the  satisficing  case,  we  note  that  players’  conditional 
utilities  are  often  expressed  in  terms  of  tunable  parameters  that  govern  (for  example)  players’ 
willingness  to  cooperate  or  defer  to  the  preferences  of  others.  We  term  these  parameters 
attitudes  and  study  how  players  might  adapt  their  attitudes  in  order  to  increase  payoff. 
Instead  of  running  the  replicator  dynamics  on  players’  actions  (as  in  the  classical  case), 
we  run  the  replicator  dynamics  on  players’  attitudes,  allowing  us  to  study  the  ecological 
fitness of exhibiting a particular attitude. This dynamics leads the players to an attitude
equilibrium, a point at which no player can improve its payoff by changing its attitudes.

As a concrete example, we focus on the well-studied Stag Hunt game, which involves two
players. They can catch a stag by cooperating, but each can catch a (much smaller) hare
alone. That is, a player earns maximum payoff if both players cooperate, but risks failure if
it  attempts  to  cooperate  while  the  other  does  not.  Under  the  standard  replicator  dynamics, 
the  population  ends  up  entirely  non-cooperative  (hunting  hare)  unless  a  significant  majority 
of  the  population  initially  hunts  stag.  So,  it  is  impossible  under  this  framework  to  evolve  a 
cooperative  population  from  non-cooperation. 
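The dependence on the initial population can be made concrete with a small calculation. In a two-strategy game the stag share grows under the replicator dynamics exactly when stag hunters earn more than hare hunters, which gives a threshold share. The payoff values below are hypothetical (the report does not state them) and the function names are illustrative.

```python
def stag_threshold(ss, sh, hs, hh):
    """Stag share at which stag and hare hunters earn the same expected payoff.

    ss: payoff for stag vs. stag, sh: stag vs. hare,
    hs: hare vs. stag,            hh: hare vs. hare.
    """
    # Stag earns x*ss + (1-x)*sh and hare earns x*hs + (1-x)*hh; solve for equality.
    return (hh - sh) / ((ss - hs) + (hh - sh))


def stag_share_grows(x, ss, sh, hs, hh):
    """True when stag hunters beat hare hunters at stag share x."""
    stag = x * ss + (1 - x) * sh
    hare = x * hs + (1 - x) * hh
    return stag > hare


# Hypothetical Stag Hunt payoffs: a shared stag is worth 4, a hare is worth 3,
# and hunting stag alone yields nothing.
threshold = stag_threshold(4, 0, 3, 3)
```

With these assumed payoffs the threshold is 0.75, so a population that starts with a smaller stag share drifts to all-hare under the standard dynamics.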

We  applied  satisficing  theory  to  see  if  we  could  do  any  better.  We  developed  a  simple 
satisficing  model  for  the  Stag  Hunt  and  applied  the  replicator  dynamics  to  the  players’ 
attitudes.  Under  the  satisficing  model,  cooperation  is  significantly  easier  to  achieve  than 
under  the  standard  model.  Indeed,  the  population  evolves  toward  cooperation  even  when 
only  10%  of  the  initial  population  hunts  stag.  This  study  is  detailed  in  [9],  which  is  included 
in  Appendix  C. 
Abstract: The design of automated multiagent cooperative
systems can be greatly facilitated by the use of conditional utilities,
which provide each individual the capability of modulating
its  interests  as  a  function  of  the  interests  of  others.  Perhaps  the 
weakest  possible  requirement  for  meaningful  coordination  is 
that  the  group  be  coherent:  no  individual  is  required,  under 
all  circumstances,  to  sacrifice  its  own  welfare  to  benefit  the 
group.  When  the  influence  relationships  among  the  members 
of  a  group  can  be  expressed  via  a  directed  acyclic  graph,  a 
group  is  coherent  if  and  only  if  its  utilities  are  conditional 
mass  functions.  This  structure  permits  the  performance  aspects 
to  be  merged  with  the  random  aspects  to  form  a  unified 
mathematical  framework  for  decision  problems  under  risk.  The 
resulting  solution  may  be  interpreted  as  the  Nash  bargaining 
solution  when  the  disagreement  points  of  all  agents  are  set  to 
zero.  Coherence  is  shown  to  be  operationally  equivalent  to  the 
concept  of  symmetry  for  a  cooperative  game.  The  resulting 
theory is designed to account for both individual and
group-level preferences.

I.  Introduction 

Many  multiagent  decision  problems  require  the  decision 
makers  to  cooperate  to  achieve  the  goals  of  the  collective.  A 
purely  cooperative  collective  is  one  in  which  the  interests  of 
all  individuals  coincide  perfectly.  Many  collectives,  however, 
are  mixed-motive,  and  opportunities  for  both  conflict  and 
cooperation  exist.  A  key  question  in  such  cases  is  the 
definition  of  what  it  means  to  be  rational.  Historically,  this 
question  has  been  addressed  from  two  distinct  points  of 
view:  game  theory  and  social  choice  theory.  Under  game 
theory, each individual seeks to optimize its own performance,
whereas in the social choice context, the goal is
to  maximize  performance  of  the  group  as  a  whole.  In  the 
former  case,  the  value  of  the  individual  decisions  to  the 
group  is  not  explicitly  considered,  and  in  the  latter  case, 
although  the  value  judgments  of  the  individuals  may  be  used 
to  define  group-level  performance,  there  is  no  assurance  that 
the  resulting  decision  will  maximize  the  performance  of,  or 
even  be  acceptable  to,  any  given  individual. 

The  reconciliation  of  these  two  extreme  perspectives  is 
an  important  theoretical  objective,  both  for  human  decision 
making  and  for  the  design  of  artificial  decision-making 
entities  who  must  cooperate.  An  important  design  principle 
for  such  scenarios  is  that  the  agents  function  according  to 
a  mathematical  framework  that  is  coherent  in  the  sense 
that  no  individual  can  be  categorically  subjugated;  i.e.,  is 
required  in  all  situations  to  sacrifice  its  own  welfare  to 
benefit  the  group.  If  an  agent  were  so  required,  it  would  not 
enjoy  even  an  exiguous  sense  of  equity — it  would  effectively 
be disenfranchised. Coherence is a minimal, yet critical,
property of a collective that is capable of sophisticated social
behaviors such as negotiation, compromise, and altruism.

II.  Modeling  Fundamentals 

Let $\{X_1, \ldots, X_n\}$, $n \ge 2$, denote a group of autonomous
decision makers. Let $\mathcal{A}_i$ denote a finite set of feasible actions
available to $X_i$, $i = 1, \ldots, n$, let $\mathcal{A} = \mathcal{A}_1 \times \cdots \times \mathcal{A}_n$ denote
the product action space, and let $\mathbf{a} = (a_1, \ldots, a_n)$ denote the
action profile that obtains when each $X_i$ instantiates $a_i \in \mathcal{A}_i$.

Classical  multiagent  decision  theory  assumes  that  each 
individual  possesses  a  total  preference  ordering  over  all 
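action profiles. Under this assumption, each $X_i$ possesses a
utility $u_{X_i}: \mathcal{A} \to \mathbb{R}$ such that $u_{X_i}(\mathbf{a}) > u_{X_i}(\mathbf{a}')$ if $X_i$ prefers
$\mathbf{a}$ to $\mathbf{a}'$, and $u_{X_i}(\mathbf{a}) = u_{X_i}(\mathbf{a}')$ if $X_i$ is indifferent between
$\mathbf{a}$ and $\mathbf{a}'$. These utility functions are assumed to provide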
a  complete  and  immutable  description  of  the  valuations  of 
actions  for  the  collective.  Generally,  they  are  provided  as 
part  of  the  problem  statement  and,  once  defined,  the  logic 
used  to  arrive  at  these  orderings  is  assumed  to  be  irrelevant 
to  the  actual  decision-making  enterprise.  We  will  term  these 
functions  categorical  utilities  since  they  are  unconditional 
valuations  of  decision-maker  preferences. 

The  categorical  model  of  preferences,  however,  restricts 
the  ability  of  individuals  to  modulate  their  preferences  by 
giving  deference  to  others  under  specific  situations.  For 
example,  consider  a  collective  that  possesses  a  hierarchical 
structure such that $X_1$ dominates $X_2$ in some functional
way. In such a case, $X_2$ may need to adjust its preferences
according to the preferences of $X_1$, but could not do so with
a categorical preference ordering. Instead, $X_2$ would possess
a set of conditional preference orderings, each depending on
the hypothetical assumption of $X_1$'s preferences. We may
represent this set of conditional preferences by a set of
conditional utilities $u_{X_2|X_1}$ such that $u_{X_2|X_1}(a_2|a_1)$ is the
utility that $X_2$ ascribes to $a_2 \in \mathcal{A}_2$ given the hypothetical
assumption that $X_1$ is committed to action $a_1 \in \mathcal{A}_1$.
This hypothetical commitment serves as the antecedent to a
hypothetical proposition whose consequent is the conditional
utility. A hypothetical commitment may take many forms, but
perhaps the most important one, with respect to the social
interaction of the agents, is that, from the perspective of $X_2$,
$X_1$ considers $a_1 \in \mathcal{A}_1$ to be its most preferred action. Under
this interpretation, $X_2$ is in a position to give deference to
$X_1$ by adjusting its conditional utility in a way that benefits
(or, in a malevolent scenario, injures) $X_1$.

In general, if every agent influences every other agent
(a fully connected group), then every agent's utility would
be conditioned on every other agent's hypothesized commitments.
It is often the case, however, that individual members
of a group are most strongly influenced by their neighbors
(functionally, spatially, or temporally). For example, hierarchical
groups possess a distinctive "top-down" structure,
where the preferences of subordinate agents are influenced
by their superiors. A hierarchical structure is a special case
of more general "Markovian" structures that are amenable to
graphical analysis. A graphical structure that has been shown
to be effective in many situations is a directed acyclic graph,
or DAG. DAGs provide a convenient and powerful language
with which to encode influence relationships, the best
known being so-called Bayesian networks, which are used
extensively for the design of artificially intelligent systems
[1-3].

A directed graph is a pair $\mathcal{G} = (X, E)$, where $X =
\{X_1, \ldots, X_n\}$ is a finite set of vertices and $E$ is a set
of directed edges linking pairs of vertices. If $X_j$ is directly
influenced by $X_i$, then there is a directed edge, denoted $X_i \to X_j$,
from $X_i$ to $X_j$. A path from $X_i$ to $X_j$ is a sequence of
vertices $\{X_i, X_{k_1}, X_{k_2}, \ldots, X_j\}$ such that $X_i \to X_{k_1} \to
X_{k_2} \to \cdots \to X_j$. We write $X_i \mapsto X_j$ if there is a path
from $X_i$ to $X_j$. If there are no paths such that $X_i \mapsto X_i$ for
any $i$, the graph is said to be acyclic.

If $X_i \to X_j$, then $X_i$ is called a parent of $X_j$, and $X_j$ is
a child of $X_i$. The set of parents of $X_i$ is denoted $\mathrm{pa}(X_i) =
\{X_{i_j} : X_{i_j} \to X_i,\ j = 1, \ldots, p_i\}$, and the set of children of $X_i$
is denoted $\mathrm{ch}(X_i)$. The descendants of $X_i$, denoted $\mathrm{de}(X_i)$,
is the subset of vertices $\{X_{i_m} : X_i \mapsto X_{i_m},\ m = 1, \ldots, d_i\}$.
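These graph definitions translate directly into code. The following is a minimal sketch, assuming the graph is stored as a set of (parent, child) pairs; the helper names and the three-vertex hierarchy are illustrative choices, not part of the paper.

```python
def parents(edges, v):
    """pa(v): vertices with a directed edge into v."""
    return {a for (a, b) in edges if b == v}


def children(edges, v):
    """ch(v): vertices that v has a directed edge into."""
    return {b for (a, b) in edges if a == v}


def descendants(edges, v):
    """de(v): every vertex reachable from v along a directed path."""
    seen, stack = set(), [v]
    while stack:
        for child in children(edges, stack.pop()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_acyclic(vertices, edges):
    """A directed graph is acyclic when no vertex has a path back to itself."""
    return all(v not in descendants(edges, v) for v in vertices)


# Illustrative three-vertex hierarchy: X1 influences X2 and X3; X2 influences X3.
VERTICES = ["X1", "X2", "X3"]
EDGES = {("X1", "X2"), ("X1", "X3"), ("X2", "X3")}
```

Adding the edge ("X3", "X1") to this example would create a cycle, and `is_acyclic` would then return False.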

A fundamental property of a DAG is the Markov condition:
nondescendent nonparents of a vertex have no influence
on the vertex, given the hypothesized commitments of its
parent vertices. Suppose $\mathrm{pa}(X_j) = \{X_i\}$. By the Markov
condition, $X_j$'s utility is therefore a function only of the
pair $(X_i, X_j)$. In general, suppose $X_i$ has $p_i$ parents, denoted
$\mathrm{pa}(X_i) = \{X_{i_1}, \ldots, X_{i_{p_i}}\}$. For any action profile $\mathbf{a} =
(a_1, \ldots, a_n)$, let $\mathbf{a}_i = (a_{i_1}, \ldots, a_{i_{p_i}})$ denote the sub-profile
of $\mathbf{a}$ corresponding to $\mathrm{pa}(X_i)$. We may then express the
utility of $\mathbf{a}$ to $X_i$ as

$$u_{X_i}(\mathbf{a}) = u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i), \tag{1}$$

the conditional utility of $X_i$ given the action sub-profile of
its parents. If $\mathrm{pa}(X_i) = \varnothing$, then its utility is not influenced
by the commitments of any other agent. Its utility is then
marginal and is of the form $u_{X_i}(a_1, \ldots, a_n) = u_{X_i}(a_i)$. The
conditional utilities constitute the edges of the DAG.

The conditional and marginal utility structures provide
an important mechanism by which an agent may assess its
preferences. Whereas the general structure $u_{X_i}(a_1, \ldots, a_n)$
requires $X_i$ to specify its preferences over all action profiles
$(a_1, \ldots, a_n)$, the conditional approach requires $X_i$ to specify
its preferences only over its own action space, given each
possible action of its parents. Thus, the agent is required to
define its conditional preference ordering with respect only to
the actions of itself for each hypothetical situation regarding
its parents. Although this structure can be generalized, in this
paper we restrict attention to collectives where at least one
agent's preferences are not conditioned on the commitments
of any other agent.

Example 2.1: Consider a collective involving three agents
with the hierarchical structure illustrated in Figure 1. $X_1$
is the primary agent (in the sense that its mission is most
critical to the success of the enterprise); $X_2$ and $X_3$ are
the secondary and tertiary agents, respectively. We observe
that $\mathrm{pa}(X_1) = \varnothing$, $\mathrm{pa}(X_2) = \{X_1\}$, and $\mathrm{pa}(X_3) =
\{X_1, X_2\}$. As a specific illustration, suppose $X_1$'s concern is
the appropriate market sector of a product to be manufactured
(either $a_1$, the affluent customers, or $a_1'$, the less prosperous
consumers). Given the sector, $X_2$'s concern is to decide
which product to manufacture (either widgets, $a_2$, or gizmos,
$a_2'$). Finally, given the sector and the product, $X_3$'s concern is
to choose which grade of materials to use (either high quality,
$a_3$, or low quality, $a_3'$). Thus, $\mathcal{A}_i = \{a_i, a_i'\}$, $i = 1, 2, 3$.
The product action space $\mathcal{A} = \mathcal{A}_1 \times \mathcal{A}_2 \times \mathcal{A}_3$ contains
eight action profiles. The three agents must cooperate to
achieve maximum productivity and hence must coordinate
their choices. The corresponding utilities are $u_{X_1}: \mathcal{A}_1 \to \mathbb{R}$,
$u_{X_2|X_1}: \mathcal{A}_2 \times \mathcal{A}_1 \to \mathbb{R}$, and $u_{X_3|X_1X_2}: \mathcal{A}_3 \times \mathcal{A}_1 \times \mathcal{A}_2 \to \mathbb{R}$.
The issue facing this group is to use these three utility
structures to formulate a plan that is acceptable individually
as well as for the group.
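The bookkeeping in this example can be checked with a short sketch. It enumerates the eight action profiles and counts how many values each agent must specify for its conditional (or marginal) utility. The string labels and helper names are illustrative assumptions.

```python
from itertools import product
from math import prod

# Action spaces and parent sets for the three-agent hierarchy (labels are illustrative).
ACTION_SPACES = {"X1": ("a1", "a1'"), "X2": ("a2", "a2'"), "X3": ("a3", "a3'")}
PARENTS = {"X1": (), "X2": ("X1",), "X3": ("X1", "X2")}

# The product action space A = A1 x A2 x A3.
profiles = list(product(*ACTION_SPACES.values()))


def conditional_table_size(agent):
    """Number of values the agent specifies: own actions times parent action combinations."""
    own = len(ACTION_SPACES[agent])
    parent_combos = prod(len(ACTION_SPACES[p]) for p in PARENTS[agent])
    return own * parent_combos


table_sizes = {agent: conditional_table_size(agent) for agent in ACTION_SPACES}
```

Here $X_1$ specifies two values, $X_2$ four, and $X_3$ eight, matching the domains of the three utilities listed in the example.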


Fig.  1.  The  influence  network  for  a  three-agent  hierarchy. 


The conditional structure permits agents to exhibit conditional
altruism by defining their preference orderings as a
function of the preferences of others. For example, suppose
$u_{X_1}(a) \gg u_{X_1}(a')$. $X_2$ could reinforce this strong preference
by setting $u_{X_2|X_1}(a|a) \gg u_{X_2|X_1}(a'|a)$, thereby deferring to
the preferences of $X_1$. This type of altruism, however, is
not categorical, since, conditioned on, say, a commitment by
$X_1$ to $a''$, $X_2$ need not prefer $a$ to $a'$. Conditional altruism
thus provides decision makers with a natural vehicle with
which to establish sophisticated social relationships that can
enhance the possibilities for compromise and negotiation. For
example, $X_2$ can use its conditional utility as a parameter
with which to adjust the amount of deference it is willing
to grant $X_1$ to effect a compromise. Conversely, $X_2$ can use
its conditional utility to threaten or punish $X_1$ by reducing
its utility of actions that are beneficial to $X_1$ and, thereby,
reducing the utility of that action to the group (e.g., through
aggregation, as will be discussed shortly).

III.  Group-level  Rational  Decisions 

The  study  of  how  individual  preferences  are  used  to  form 
a  group  decision  is  the  central  issue  of  social  choice  theory 
[4-6], and hence is relevant to the study of autonomous
multiagent decision-making groups. Social choice theory
has traditionally been applied to human societies, but the
concepts  are  directly  applicable  to  artificial  societies  as  well, 
particularly  those  that  are  intended  to  function  cooperatively. 
A  key  issue  of  this  theory  is  how  to  aggregate  the  interests  of 
individuals  to  form  a  group  decision  in  a  democratic  fashion; 
i.e.,  in  a  way  such  that  the  interests  of  all  individuals  are 
respected  and  given  equitable  consideration. 

Informally,  a  society  is  coherent  if  each  member  has  a 
“seat  at  the  table”  in  the  sense  that  the  possibility  exists 
(although  not  the  guarantee)  that,  for  each  of  the  individuals, 
a  solution  exists  that  is  good  for  the  group  and  is  also 
good  for  that  individual.  Obviously,  most  voting  schemes  are 
transparently  coherent  (assuming  each  voter’s  most  preferred 
candidate is on the ballot), but when complex influence relationships
exist among the members of a group, establishing
coherence may not be obvious.

To formalize the notion of coherence, let us assume that $X_i$
is able, after taking into consideration all social, economic,
and political relationships between it and other agents, to
define a utility $u_{X_i}$ over its action space. We also assume that
the group possesses a group-level utility $u_{X_1 \cdots X_n}: \mathcal{A} \to \mathbb{R}$.

Definition 3.1: Let $u_{X_i}$ denote $X_i$'s categorical utility,
$i = 1, \ldots, n$, and let $u_{X_1 \cdots X_n}$ denote the utility of
the group $X = \{X_1, \ldots, X_n\}$. $X$ is coherent if, given
that $u_{X_i}(a_i) > u_{X_i}(a_i')$, there exists an action sub-profile
$(a_1^*, \ldots, a_{i-1}^*, a_{i+1}^*, \ldots, a_n^*)$ such that

$$u_{X_1 \cdots X_n}(a_1^*, \ldots, a_{i-1}^*, a_i, a_{i+1}^*, \ldots, a_n^*) > u_{X_1 \cdots X_n}(a_1^*, \ldots, a_{i-1}^*, a_i', a_{i+1}^*, \ldots, a_n^*).$$

If  there  does  not  exist  such  a  sub-profile,  then  Xi  is  in 
a  position  of  categorical  subjugation :  every  action  profile 
that  contains  its  most  preferred  action  is  dominated  by 
profiles  that  do  not  contain  its  most  preferred  action.  In 
terms of voting, incoherence means that, no matter how
the others vote, $X_i$'s candidate will lose. Effectively, $X_i$
is disenfranchised. Categorical subjugation is similar to the
notion of suppression as discussed by [7] and [5].

The question thus becomes: what constraints must be
placed on the utilities to ensure that a condition of categorical
subjugation is impossible? To address this question, let us
turn to an analogous issue. A Dutch book is a gambling
situation such that, no matter what the outcome, the gambler
will be worse off for having taken the gamble: a situation
of sure loss (one's reward is always less than one's stake).
To illustrate a Dutch book, suppose $Y$ can take one of
two distinct values, $y_1$ or $y_2$, and let $q(y)$ denote a belief
function¹ of $y$; i.e., $q(y)$ measures the strength of belief that
$Y = y$.

By convention, we will assume that we have full belief
that exactly one of these values obtains: the disjunction of
$y_1$ and $y_2$ must occur. We further assume that beliefs are
additive; thus, $q(y_1 \vee y_2) = q(y_1) + q(y_2) = 1$. Now let $Z$
take on one of two distinct values $z_1$ or $z_2$, and let $r(y, z)$
denote the belief that $Y = y$ and $Z = z$ simultaneously. Let
us assume that $q(y_2) > q(y_1)$, but $r(y_1, z_1) > r(y_2, z_1)$ and
$r(y_1, z_2) > r(y_2, z_2)$. Suppose you purchase a \$1 gamble on
$Y = y_2$, and deem a fair purchase price to be $q(y_2)$; i.e.,
you pay $q(y_2)$ dollars for the gamble to win \$1. Now also suppose
you sell the gamble $(y_2, z_1) \vee (y_2, z_2)$. By additivity of
beliefs, a fair selling price for this bet would be $r[(y_2, z_1) \vee
(y_2, z_2)] = r(y_2, z_1) + r(y_2, z_2)$. However, according to
the above ordering, you must have $q(y_2) > \frac{1}{2}$ and, since
$r(y_2, z_1) + r(y_2, z_2) < r(y_1, z_1) + r(y_1, z_2)$, it follows
(assuming the joint beliefs likewise sum to one) that
$r[(y_2, z_1) \vee (y_2, z_2)] < \frac{1}{2}$. After all gambles have been bought
and sold, your net wealth is $r[(y_2, z_1) \vee (y_2, z_2)] - q(y_2) < 0$.
To overcome this loss, you must make up the difference
once the outcome of the gamble is known. But if neither
$y_2$ nor $(y_2, z_1) \vee (y_2, z_2)$ occurs, you win nothing and you
pay nothing; if $(y_2, z_1) \vee (y_2, z_2)$ occurs, then, of course, $y_2$
occurs, so you win \$1, which you must pay to the buyer of
your gamble. Thus, once the gambles have been bought and
sold, your net wealth is invariant to whatever happens: you
suffer a sure loss.

¹We refrain from using the term "probability" here, since we do not
require $q$ to possess all of the properties of a probability mass function.

A  belief  system  is  said  to  be  coherent  if  it  is  not  possible 
to  construct  a  Dutch  book.  The  Dutch  Book  theorem  [8, 9] 
and  its  converse  [10]  state  that  a  belief  system  is  coherent 
if  and  only  if  it  complies  with  a  probability  measure  that 
describes  the  degrees  of  belief  regarding  the  propositions 
under  consideration.  The  above  example  does  not  comply 
with the laws of probability theory, since $q(y_2) \neq r(y_2, z_1) +
r(y_2, z_2)$; i.e., marginalization fails.
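The sure loss can be verified with concrete numbers. The belief values below are hypothetical; they are chosen only to satisfy the orderings assumed in the example, with $q(y_2) > q(y_1)$ while each joint belief involving $y_1$ exceeds the corresponding one involving $y_2$. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical beliefs: q over Y, r over (Y, Z). Both sum to one, but r does not
# marginalize to q.
q = {"y1": Fraction(2, 5), "y2": Fraction(3, 5)}
r = {
    ("y1", "z1"): Fraction(3, 10), ("y2", "z1"): Fraction(1, 5),
    ("y1", "z2"): Fraction(3, 10), ("y2", "z2"): Fraction(1, 5),
}


def net_wealth(outcome):
    """Net wealth after buying the $1 bet on y2 and selling the $1 bet on (y2,z1) or (y2,z2)."""
    y, _z = outcome
    price_paid = q["y2"]                                   # cost of the bet you buy
    price_received = r[("y2", "z1")] + r[("y2", "z2")]     # income from the bet you sell
    winnings = 1 if y == "y2" else 0                       # your bought bet pays off
    payout = 1 if y == "y2" else 0                         # you pay the buyer of your bet
    return price_received - price_paid + winnings - payout


wealth_by_outcome = {outcome: net_wealth(outcome) for outcome in r}
marginal_of_y2 = r[("y2", "z1")] + r[("y2", "z2")]
```

Under these assumed beliefs every outcome leaves the gambler with a net wealth of -1/5, and the marginal of $r$ at $y_2$ is 2/5 rather than $q(y_2) = 3/5$.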

Mathematically, a condition of categorical subjugation corresponds
to a condition of sure loss. Therefore, to eliminate
the  possibility  of  categorical  subjugation,  the  utilities  must 
possess  the  same  syntax  as  probability  mass  functions.  We 
formalize  this  result  as  follows. 

Let $X = \{X_1, \ldots, X_n\}$ be a group of distributed decision
makers whose influence relationships can be expressed with
a directed acyclic graph. For each $X_i$, let $\mathrm{pa}(X_i) =
\{X_{i_1}, \ldots, X_{i_{p_i}}\}$ denote the $p_i$ parents of $X_i$, and let $\boldsymbol{\mathcal{A}}_i =
\mathcal{A}_{i_1} \times \cdots \times \mathcal{A}_{i_{p_i}}$ denote the $p_i$-dimensional product of the
action spaces corresponding to the parents of $X_i$. If $X_i$ has
no parents, then $\boldsymbol{\mathcal{A}}_i = \varnothing$. For each $X_i$, $u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i)$
is the utility that $X_i$ ascribes to $a_i$, conditioned on $X_{i_j}$
committing to $a_{i_j}$, $j = 1, \ldots, p_i$. If $X_i$ has no parents, the
conditional utility is the marginal utility; i.e., $u_{X_i \mid \mathrm{pa}(X_i)} =
u_{X_i}$ if $\mathrm{pa}(X_i) = \varnothing$.

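Theorem 3.1: Categorical subjugation cannot occur if and
only if the utilities $u_{X_i \mid \mathrm{pa}(X_i)}$ are conditional mass functions
defined over $\mathcal{A}_i \times \boldsymbol{\mathcal{A}}_i$; i.e., $u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i) \ge 0\ \forall a_i \in \mathcal{A}_i$
and $\sum_{a_i \in \mathcal{A}_i} u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i) = 1\ \forall \mathbf{a}_i \in \boldsymbol{\mathcal{A}}_i$. Furthermore,
the group-level utility of $(a_1, \ldots, a_n)$ for the group $X =
\{X_1, \ldots, X_n\}$ is

$$u_X(\mathbf{a}) = u_{X_1 \cdots X_n}(a_1, \ldots, a_n) = \prod_{i=1}^{n} u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i), \tag{2}$$

where $\mathbf{a}_i$ is the sub-profile of $\mathbf{a}$ corresponding to $\mathrm{pa}(X_i)$.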

Proof: The Dutch Book Theorem and its converse establish
that the utilities must be mass functions. Consequently,
all of the edges of the DAG are utilities that possess the mathematical
structure of conditional probability mass functions
(albeit with different semantics). Furthermore, the categorical
utilities of each root vertex of the DAG possess the mathematical
structure of marginal probability mass functions.
Thus, the DAG satisfies all of the conditions of a Bayesian
network, and we may apply the fundamental theorem of
Bayesian networks; namely, that the joint probability mass
function of the random variables associated with the vertices
is the product of the conditional probability mass functions
of all vertices with parents, and the marginal mass functions
of all root vertices [1-3]. $\square$

The  content  of  this  theorem  is  that  the  mathematics  of 
probability theory, which traditionally applies to epistemological
situations involving assessments of belief and
knowledge,  also  applies  to  praxeological  situations  involving 
assessments  of  expediency  and  efficiency.  This  result  means 
that  the  mathematical  notions  of  probability  theory,  such  as 
independence,  conditioning,  marginalization,  and  so  forth, 
can  be  given  praxeological,  as  well  as  epistemological, 
interpretations. 

Once  the  joint  utility  has  been  formed  by  the  aggregation 
of  individual  utilities,  the  group-optimal  action  profile  that 
maximizes  the  group  utility  is 

$$\mathbf{a}_G = \arg\max_{\mathbf{a} \in \mathcal{A}} \prod_{i=1}^{n} u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i). \tag{3}$$
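A brute-force sketch of this aggregation for a three-agent hierarchy ($X_1$ has no parents, $X_2$ is conditioned on $X_1$, and $X_3$ on both) is shown below. All numerical utilities are hypothetical; each is a mass function over the agent's own actions for every fixed parent sub-profile, as the theorem requires.

```python
from itertools import product

ACTIONS = ("a", "a'")  # each agent chooses between two options

# Hypothetical utilities. For fixed parent actions, the values sum to one.
u1 = {"a": 0.7, "a'": 0.3}
u2_given_1 = {  # key: (a2, a1)
    ("a", "a"): 0.8, ("a'", "a"): 0.2,
    ("a", "a'"): 0.4, ("a'", "a'"): 0.6,
}
u3_given_12 = {  # key: (a3, a1, a2)
    ("a", "a", "a"): 0.6, ("a'", "a", "a"): 0.4,
    ("a", "a", "a'"): 0.5, ("a'", "a", "a'"): 0.5,
    ("a", "a'", "a"): 0.3, ("a'", "a'", "a"): 0.7,
    ("a", "a'", "a'"): 0.1, ("a'", "a'", "a'"): 0.9,
}


def group_utility(profile):
    """Product of the marginal and conditional utilities for profile (a1, a2, a3)."""
    a1, a2, a3 = profile
    return u1[a1] * u2_given_1[(a2, a1)] * u3_given_12[(a3, a1, a2)]


all_profiles = list(product(ACTIONS, repeat=3))
group_optimum = max(all_profiles, key=group_utility)
total_mass = sum(group_utility(p) for p in all_profiles)
```

With these assumed numbers the group optimum is the profile in which every agent chooses its unprimed action, and the group utility sums to one over the eight profiles, as a joint mass function should.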

IV.  Individually  Rational  Solutions 

Although  the  social  choice-theoretic  approach  presented 
above  possesses  a  weak  notion  of  acceptability  for  the 
individuals  (coherence),  that  does  not  imply  that  the  group 
solution  is  acceptable  to  any  given  individual  in  terms  of 
benefit  to  it.  Simply  having  the  opportunity  for  one’s  interests 
to  be  equitably  considered  by  the  group  does  not  imply 
that  one’s  interests  are  adequately  represented  in  the  group 
decision. 

The best-known solution concept of non-cooperative
game theory is the Nash equilibrium [11]. This solution concept
is  a  reasonable  approach  under  competitive  scenarios,  but 
when  the  agents  are  ostensibly  to  cooperate,  it  can  lead 
to  overly  pessimistic  results.  For  example,  for  scenarios 
where  attempting  to  cooperate  leaves  one  vulnerable  to 
exploitation,  such  as  Prisoner’s  Dilemma-type  games,  the 
Nash  equilibrium  leads  to  the  next-worst  solution,  rather  than 
the  Pareto  solution.  Particularly  when  the  agents  are  disposed 
to  communicate  with  each  other,  a  more  appropriate  solution 
concept  is  one  that  permits  some  notion  of  equity  or  fairness 
to  guide  the  decisions. 

Cooperative game theory differs from non-cooperative
theory in that players may enter into binding agreements
regarding their behavior. For the players to forge an agreement,
however, each must achieve an acceptable degree of
satisfaction. A bargaining game is a cooperative game in
which each participant possesses a disagreement point that
defines the benefit that is guaranteed to accrue to it if a
compromise cannot be reached. The disagreement point,
therefore, is an indication of the strategic strength that is
conferred on the participant as it participates in negotiations:
the higher the disagreement point, the greater the bargaining
strength of the participant.

A well-known bargaining solution concept that offers a
clear definition of individual acceptability is the Nash bargain
[12], which permits each participant to make maximal
use of its strategic strength. The approach is based on
four fundamental principles: (i) invariance to positive affine
transformations; (ii) Pareto optimality; (iii) independence of
irrelevant alternatives; and (iv) symmetry, which is the notion
that no individual agent can expect that the other agents
will grant it better terms than that individual itself would
be willing to grant, were roles reversed.

Nash showed that these four conditions lead to a unique
solution. Let $d_{X_i}$ denote the disagreement point for $X_i$. The
negotiation set, denoted $\mathcal{N}$, is the subset of action profiles
such that every participant achieves at least its disagreement
point. Although Nash's theory pertains to categorical utilities,
we may adapt the concept to the conditional case by
replacing categorical utilities with conditional utilities. The
negotiation set is defined as

$$\mathcal{N} = \left\{ \mathbf{a} \in \mathcal{A} : u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i) \ge d_{X_i},\ i = 1, \ldots, n \right\}. \tag{4}$$

Following in the spirit of Nash's result, the bargaining
solution is

$$\mathbf{a}_N = \arg\max_{\mathbf{a} \in \mathcal{N}} \prod_{i=1}^{n} \left[ u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i) - d_{X_i} \right]. \tag{5}$$

We  note  that  this  solution  is  not,  strictly  speaking,  a  Nash 
bargain,  since  the  utilities  are  not  categorical.  Nevertheless, 
the solution still possesses the key feature of the Nash bargain;
namely,  that  each  participant  takes  full  advantage  of  its 
strategic  strength  in  that  all  action  profiles  that  do  not 
achieve  at  least  its  disagreement  point  are  excluded  from 
consideration. 
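The negotiation set and the bargaining solution can be computed by enumeration for a small group. The sketch below assumes a two-agent hierarchy in which $X_2$ is conditioned on $X_1$; the utility values and disagreement points are hypothetical and the function names are illustrative.

```python
from itertools import product

ACTIONS = ("a", "a'")

# Hypothetical utilities for a two-agent hierarchy X1 -> X2.
u1 = {"a": 0.55, "a'": 0.45}
u2_given_1 = {  # key: (a2, a1)
    ("a", "a"): 0.6, ("a'", "a"): 0.4,
    ("a", "a'"): 0.9, ("a'", "a'"): 0.1,
}


def utilities(profile):
    """Utility of profile (a1, a2) to X1 and to X2."""
    a1, a2 = profile
    return (u1[a1], u2_given_1[(a2, a1)])


def negotiation_set(d):
    """Profiles in which every agent achieves at least its disagreement point."""
    return [p for p in product(ACTIONS, repeat=2)
            if all(u >= di for u, di in zip(utilities(p), d))]


def bargain(d):
    """Profile in the negotiation set maximizing the product of (utility - disagreement)."""
    candidates = negotiation_set(d)
    if not candidates:
        return None  # no profile satisfies every disagreement point

    def nash_product(profile):
        result = 1.0
        for u, di in zip(utilities(profile), d):
            result *= (u - di)
        return result

    return max(candidates, key=nash_product)


solution_zero = bargain((0.0, 0.0))       # disagreement points at zero
solution_strong_x1 = bargain((0.5, 0.0))  # X1 holds a higher disagreement point
```

With zero disagreement points the solution in this assumed example is $(a', a)$; raising $X_1$'s disagreement point to 0.5 removes every profile in which $X_1$ plays $a'$, and the solution shifts to $(a, a)$.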

V.  Reconciling  Group  and  Individual  Choices 

The  above  discussion  demonstrates  that,  for  groups  whose 
social  relationships  can  be  represented  by  a  directed  acyclic 
graph,  the  bargaining  solution  and  the  coherent  group-level 
optimal solution possess similar structure, differing mainly
in the introduction of a disagreement point for the bargaining
solution concept. With the bargaining approach, the disagreement
point is the value that the decision maker can guarantee
for  itself,  regardless  of  whether  or  not  a  compromise  can 
be  reached.  The  justification  for  this  approach  is  that  it  is 
possible  for  the  agent  to  walk  away  from  negotiations  and 
go  its  own  way  without  regard  for  others.  While  a  go-it- 
alone  option  may  be  possible  for  human  decision  makers, 
an  automated  system  that  will  not  cooperate  is  likely  to 
be  dysfunctional.  In  fact,  it  may  be  necessary  that  they 
reach  a  compromise  solution,  regardless  of  the  individual 
costs.  If  such  a  situation  obtains,  then  the  disagreement  point 
for  each  agent  will  be  its  zero  level,  in  which  case  the 
bargaining  solution  will  coincide  with  the  optimal  solution 
for the group. This is an interesting result. Why should the
best individual solution in the sense of a fair compromise
for each individual, as expressed by (5), also result in the
best group-level solution, as expressed by (3)? The fact
that  there  is  such  a  close  correspondence  suggests  that  the 
notions  of  symmetry  (no  individual  can  expect  that  others 
will  grant  it  better  terms  than  it  would  be  willing  to  grant, 
were  roles  reversed)  and  coherence  (no  individual  interests 
can  be  categorically  subjugated  to  the  interests  of  the  group) 
are  operationally  equivalent. 

Theorem  5.1:  For  groups  whose  social  relationships  can 
be  represented  by  a  directed  acyclic  graph,  (a)  coherence 
implies  symmetry,  and  (b)  if  symmetry  applies  and  one 
individual  is  categorically  subjugated,  then  all  individuals  are 
categorically  subjugated — a  condition  of  mutual  categorical 
subjugation. 

Proof: If coherence holds, then the utilities may be
aggregated as the product of the conditional and marginal
utilities, as given by (2), which, by changing the zero level,
yields the bargain structure

$$\prod_{i=1}^{n} \left[ u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i) - d_{X_i} \right]. \tag{6}$$

Since the labeling of agents is arbitrary, exchanging $X_i$ and
$X_j$ leaves this structure unaffected, hence symmetry holds.

Now suppose $X_i$ can be categorically subjugated. By
symmetry, if the roles of $X_i$ and $X_j$, $j \neq i$, are exchanged
(including the utilities), then the solution is unchanged.
Thus, $X_j$ must be categorically subjugated as well. Since
$j$ is arbitrary, this means that all players are categorically
subjugated. $\square$

Mutual  categorical  subjugation  is  a  pathological  condition.  It 
says  that  all  agents  must  always  sacrifice  their  own  welfare 
to  benefit  the  group.  In  fact,  even  if  all  individuals  could 
agree  that  a  given  action  profile  were  simultaneously  best  for 
all  of  them,  that  profile  would  not  be  best  for  the  group — 
a  violation  of  the  Pareto  principle.  More  generally,  such  a 
situation  would  mean  that  the  interests  of  the  individuals  have 
only  partial  influence,  at  best  (and  perhaps  no  influence), 
on  the  interests  of  the  group.  Such  a  pathological  situation 
would  violate  the  most  fundamental  premise  of  social  choice 
theory: “Democratic theory is based on the premise that the
resolution of a matter of social policy, group choice or collective
action should be based on the desires or preferences
of the individuals in the society, group, or collective” [5,
p. 3].

VI.  Multiagent  Decision  Making  Under  Risk 

A  decision  is  made  under  risk  if  the  utility  of  actions 
is  dependent  upon  random  phenomena.  When  decisions 
are  made  under  risk,  the  classical  approach  is  a  two-step 
procedure.  First,  the  utilities  are  defined  to  correspond  with 
the  decision  makers’  preferences;  next,  the  expected  value 
of the utility is computed. However, since Theorem 3.1
establishes that coherent utilities must possess the mathematical
syntax of probability mass functions, the praxeological
and epistemological aspects of a decision problem may be
merged into a single praxi-epistemic structure. In particular,
we may view the decision-making elements and the random
elements as vertices of a praxi-epistemic network, whose
edges are conditional mass functions.

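Let $\theta = \{\theta_1, \ldots, \theta_m\}$ denote $m$ random variables over the
product sample space $\Theta = \Theta_1 \times \cdots \times \Theta_m$ associated with
the decision problem which, when merged with $X$, forms an
$(n+m)$-dimensional DAG, called a praxi-epistemic network.
Let $\vartheta_i$ denote the realization of $\theta_i$, and let
$\boldsymbol{\vartheta} = (\vartheta_1, \ldots, \vartheta_m)$.
Then the joint praxi-epistemic utility is

$$u_{X\theta}(\mathbf{a}, \boldsymbol{\vartheta}) = \prod_{i=1}^{n} u_{X_i \mid \mathrm{pa}(X_i)}(a_i \mid \mathbf{a}_i, \boldsymbol{\vartheta}_i) \prod_{j=1}^{m} p_{\theta_j \mid \mathrm{pa}(\theta_j)}(\vartheta_j \mid \mathbf{a}_j, \boldsymbol{\vartheta}_j), \qquad (7)$$

where $p_{\theta_j \mid \mathrm{pa}(\theta_j)}$ is the conditional probability of $\theta_j$ given its
parents, $\mathbf{a}_i$ and $\boldsymbol{\vartheta}_i$ correspond to the praxeic and epistemic
parents, respectively, of $X_i$, and $\mathbf{a}_j$ and $\boldsymbol{\vartheta}_j$ correspond to
the praxeic and epistemic parents, respectively, of $\theta_j$. The
expected utility then becomes the praxeic marginal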

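$$u_X(\mathbf{a}) = \sum_{\boldsymbol{\vartheta}} u_{X\theta}(\mathbf{a}, \boldsymbol{\vartheta}). \qquad (8)$$

Theorem 3.1 establishes that maximizing $u_X(\mathbf{a})$ yields the
optimal joint action; i.e.,

$$\mathbf{a}_G = \arg\max_{\mathbf{a}} u_X(\mathbf{a}). \qquad (9)$$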

Furthermore,  under  the  assumption  that  no  action  can  be 
taken  unless  all  decision  makers  can  agree  on  a  joint  action, 
the  disagreement  point  for  each  decision  maker  is  zero,  and 
(8)  constitutes  the  bargaining  solution  for  the  group.  Thus, 
the  group  decision  can  also  be  viewed  as  individually  optimal 
in  the  sense  that  each  takes  full  advantage  of  its  strategic 
strength. 

In  addition,  this  solution  also  satisfies  the  coherency 
property,  as  established  by  the  following  corollary. 

Corollary 6.1: Let the marginal expected utility of $X_i$ be
given by

$$u_{X_i}(a_i) = \sum_{\sim a_i} u_X(a_1, \ldots, a_n), \qquad i = 1, \ldots, n, \qquad (10)$$

where $\sum_{\sim a_i}$ is the so-called “not-sum” notation meaning that
the sum is taken over all elements of $(a_1, \ldots, a_n)$ except $a_i$.

If $u_{X_i}(a_i) > u_{X_i}(a_i')$, then there exists
$(a_1^*, \ldots, a_{i-1}^*, a_{i+1}^*, \ldots, a_n^*)$ such that

$$u_X(a_1^*, \ldots, a_{i-1}^*, a_i, a_{i+1}^*, \ldots, a_n^*) > u_X(a_1^*, \ldots, a_{i-1}^*, a_i', a_{i+1}^*, \ldots, a_n^*).$$

Proof: Suppose $u_{X_i}(a_i) > u_{X_i}(a_i')$ holds, but there is no
such $(a_1^*, \ldots, a_{i-1}^*, a_{i+1}^*, \ldots, a_n^*)$. Then

$$u_{X_i}(a_i) = \sum_{\sim a_i} u_X(a_1, \ldots, a_i, \ldots, a_n) \le \sum_{\sim a_i'} u_X(a_1, \ldots, a_i', \ldots, a_n) = u_{X_i}(a_i'), \qquad (11)$$

a contradiction. □
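The corollary can be checked by brute force on a small example. The sketch below is illustrative only: the joint utility `u` is a randomly generated, normalized table over three hypothetical agents (its sizes and values are assumptions, not taken from the paper), and the helper names are invented for this illustration. For every pair of actions whose marginals are strictly ordered, it searches for the profile of the other agents' actions whose existence the corollary asserts.

```python
import itertools
import random


def marginal(u, i, a_i):
    """Not-sum: add u over every profile whose i-th entry is a_i."""
    return sum(v for profile, v in u.items() if profile[i] == a_i)


def find_witness(u, sizes, i, a_i, a_alt):
    """Return the other agents' actions at which a_i strictly beats a_alt, or None."""
    others = [range(s) for k, s in enumerate(sizes) if k != i]
    for rest in itertools.product(*others):
        with_a = rest[:i] + (a_i,) + rest[i:]
        with_alt = rest[:i] + (a_alt,) + rest[i:]
        if u[with_a] > u[with_alt]:
            return rest
    return None


def corollary_holds(u, sizes):
    """True if every strictly ordered pair of marginals has a witness profile."""
    for i, s in enumerate(sizes):
        for a_i, a_alt in itertools.permutations(range(s), 2):
            if marginal(u, i, a_i) > marginal(u, i, a_alt):
                if find_witness(u, sizes, i, a_i, a_alt) is None:
                    return False
    return True


# Hypothetical joint utility: random positive weights normalized to sum to one.
random.seed(0)
sizes = (2, 3, 2)
profiles = list(itertools.product(*(range(s) for s in sizes)))
weights = [random.random() for _ in profiles]
total = sum(weights)
u = {p: w / total for p, w in zip(profiles, weights)}

print(corollary_holds(u, sizes))
```

The search mirrors the proof by contradiction: if no witness existed, every term of the not-sum for the first action would be bounded by the corresponding term for the second, so its marginal could not be strictly larger.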

Example 6.1: Figure 2 displays a praxi-epistemic network
corresponding to the hierarchical manufacturing scenario with
the network introduced in Example 2.1. The root node of
this DAG is $X_1$, the agent who decides which market sector
to target, whose utility is given by

$$u_{X_1}(a_1) = 0.6 \qquad u_{X_1}(a_1') = 0.4.$$

This example also includes a random component, $\theta$, that
characterizes the economic environment of the market sector.
Let us take $\Theta = \{\vartheta, \vartheta'\}$, where $\vartheta$ corresponds to a growing
economy and $\vartheta'$ corresponds to a shrinking economy. Thus,
the probability of the economic status is conditioned on
the market sector. The corresponding conditional probability
functions are

$$p_{\theta \mid X_1}(\vartheta \mid a_1) = 0.5 \qquad p_{\theta \mid X_1}(\vartheta' \mid a_1) = 0.5$$

$$p_{\theta \mid X_1}(\vartheta \mid a_1') = 0.6 \qquad p_{\theta \mid X_1}(\vartheta' \mid a_1') = 0.4.$$

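The utility of the product to be manufactured depends upon
the market sector and the economic state, and is given as

$$u_{X_2 \mid X_1\theta}(a_2 \mid a_1, \vartheta) = 0.7 \qquad u_{X_2 \mid X_1\theta}(a_2' \mid a_1, \vartheta) = 0.3$$

$$u_{X_2 \mid X_1\theta}(a_2 \mid a_1, \vartheta') = 0.5 \qquad u_{X_2 \mid X_1\theta}(a_2' \mid a_1, \vartheta') = 0.5$$

$$u_{X_2 \mid X_1\theta}(a_2 \mid a_1', \vartheta) = 0.4 \qquad u_{X_2 \mid X_1\theta}(a_2' \mid a_1', \vartheta) = 0.6$$

$$u_{X_2 \mid X_1\theta}(a_2 \mid a_1', \vartheta') = 0.8 \qquad u_{X_2 \mid X_1\theta}(a_2' \mid a_1', \vartheta') = 0.2$$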


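Finally, the utility of the grade of materials used in the
manufacture is conditioned on the product and the sector,
and is given by

$$u_{X_3 \mid X_1X_2}(a_3 \mid a_1, a_2) = 0.6 \qquad u_{X_3 \mid X_1X_2}(a_3' \mid a_1, a_2) = 0.4$$

$$u_{X_3 \mid X_1X_2}(a_3 \mid a_1, a_2') = 0.6 \qquad u_{X_3 \mid X_1X_2}(a_3' \mid a_1, a_2') = 0.4$$

$$u_{X_3 \mid X_1X_2}(a_3 \mid a_1', a_2) = 0.3 \qquad u_{X_3 \mid X_1X_2}(a_3' \mid a_1', a_2) = 0.7$$

$$u_{X_3 \mid X_1X_2}(a_3 \mid a_1', a_2') = 0.2 \qquad u_{X_3 \mid X_1X_2}(a_3' \mid a_1', a_2') = 0.8$$

The praxi-epistemic utility is

$$u_{X_1X_2X_3\theta}(a_1, a_2, a_3, \vartheta) = u_{X_1}(a_1)\, u_{X_2 \mid X_1\theta}(a_2 \mid a_1, \vartheta)\, u_{X_3 \mid X_1X_2}(a_3 \mid a_1, a_2)\, p_{\theta \mid X_1}(\vartheta \mid a_1). \qquad (12)$$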


Fig.  2.  The  praxi-epistemic  network  for  the  hierarchy  scenario. 

The expected utility is the praxeic marginal

$$u_{X_1X_2X_3}(a_1, a_2, a_3) = \sum_{\vartheta \in \Theta} u_{X_1X_2X_3\theta}(a_1, a_2, a_3, \vartheta). \qquad (13)$$

Straightforward calculations using the above utility values
indicate that the optimal solution for the group is $(a_1, a_2, a_3)$,
with a global utility value of $u_{X_1X_2X_3}(a_1, a_2, a_3) = 0.216$.
Upon computing the marginals, we obtain

$$u_{X_1}(a_1) = 0.6$$

$$u_{X_2}(a_2) = 0.584$$

$$u_{X_3}(a_3) = 0.4624,$$

indicating that the best group solution is also best for $X_1$
and $X_2$, but is worst for $X_3$.
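The calculation in Example 6.1 can be reproduced with a short script. This is a sketch under a stated assumption: the utility and probability values are those listed in the example, assigned to action and state combinations in the order (unprimed, primed) for the first conditioning variable and then the second, which is the assignment that reproduces the reported figures 0.216, 0.584 and 0.4624. The string labels (`"a1'"`, `"grow"`, and so on) and the function names are introduced here for illustration.

```python
from itertools import product

A1, A2, A3 = ("a1", "a1'"), ("a2", "a2'"), ("a3", "a3'")
STATES = ("grow", "shrink")  # growing / shrinking economy

u_x1 = {"a1": 0.6, "a1'": 0.4}
# p(state | a1)
p_state = {("a1", "grow"): 0.5, ("a1", "shrink"): 0.5,
           ("a1'", "grow"): 0.6, ("a1'", "shrink"): 0.4}
# Utility of the unprimed action a2 given (a1, state); a2' gets the complement.
u_a2 = {("a1", "grow"): 0.7, ("a1", "shrink"): 0.5,
        ("a1'", "grow"): 0.4, ("a1'", "shrink"): 0.8}
# Utility of the unprimed action a3 given (a1, a2); a3' gets the complement.
u_a3 = {("a1", "a2"): 0.6, ("a1", "a2'"): 0.6,
        ("a1'", "a2"): 0.3, ("a1'", "a2'"): 0.2}


def u_x2(a2, a1, state):
    v = u_a2[(a1, state)]
    return v if a2 == "a2" else 1.0 - v


def u_x3(a3, a1, a2):
    v = u_a3[(a1, a2)]
    return v if a3 == "a3" else 1.0 - v


def group_utility(a1, a2, a3):
    """Praxeic marginal (13): sum the praxi-epistemic utility (12) over the state."""
    return sum(u_x1[a1] * u_x2(a2, a1, s) * u_x3(a3, a1, a2) * p_state[(a1, s)]
               for s in STATES)


profiles = list(product(A1, A2, A3))
best = max(profiles, key=lambda p: group_utility(*p))

# Individual marginals: sum the group utility over the other agents' actions.
m1 = sum(group_utility("a1", a2, a3) for a2, a3 in product(A2, A3))
m2 = sum(group_utility(a1, "a2", a3) for a1, a3 in product(A1, A3))
m3 = sum(group_utility(a1, a2, "a3") for a1, a2 in product(A1, A2))

print(best, round(group_utility(*best), 4))  # ('a1', 'a2', 'a3') 0.216
print(round(m1, 4), round(m2, 4), round(m3, 4))  # 0.6 0.584 0.4624
```

Because each marginal for a two-action agent sums to one with that of its primed alternative, a value above 0.5 means the group optimum includes that agent's individually preferred action, and a value below 0.5 means it does not.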


VII.  Conclusion 

This paper provides a new theoretical approach to the
modeling of distributed autonomous decision makers. Potential
applications include mobile unmanned robotic systems
such as coordinated UAV surveillance and reconnaissance
missions, distributed decision making, scheduling and
coordination for manufacturing enterprise automation, and
man/machine decision-making scenarios.

Conventional categorical preference orderings are not designed
to account for sophisticated social relationships such
as compromise and negotiation, since they do not easily
permit individuals to expand their spheres of interest to
account for the preferences of others. The introduction of
conditional utilities is an important contribution to the theory
of multiagent decision making, since it permits each agent
to express its preferences as a function of the preferences
of others. Individual conditional utilities can be
aggregated to form a group utility that incorporates the
social relationships that exist among the individuals, thereby
providing a complete model of the community of decision
makers.

A  second  contribution  is  the  notion  of  coherence  and  the 
introduction  of  a  mathematical  structure  for  the  utilities  that 
ensures  that  no  agent  can  be  categorically  subjugated.  This 
structure  permits  the  social  relationships  between  individuals 
to  be  represented  by  a  directed  acyclic  graph  whose  edges 
are  conditional  mass  functions  —  a  Bayesian  network.  This 
new  syntax  provides  a  natural  vehicle  with  which  to  model 
sophisticated  social  relationships  such  as  altruism. 

A  third  contribution  is  the  merging  of  the  praxeic  and 
epistemic  components  of  a  decision  problem  into  a  single 
praxi-epistemic  utility  that  accounts  for  both  utility  and  risk. 
Abstract  The design of robotic systems that are capable
of sophisticated social behavior, such as cooperation, compromise,
negotiation, and altruism, requires more complex
mathematical models than are afforded by the classical mechanisms
for making value judgments and decisions. A new
concept of multi-agent satisficing, defined in terms of relative
effectiveness and efficiency, is an alternative to classical
optimization-based decision making. Conditional utilities,
which take into account the interests of others as well as
the self, represent an alternative to the categorical utilities of
classical decision theory. A multi-agent utility aggregation
structure is developed that avoids the sure subjugation of the
interests of any individual to the interests of the group. By
expressing a society as a directed acyclic graph, Bayesian
network theory is applied to artificial societies. A satisficing
social welfare function accounts for the influence relationships
among decision-making agents.

Keywords  Multi-agent  Systems,  Game  Theory,  Social 
Choice  Theory,  Satisficing,  Conditional  Preferences, 
Coherence 


1  Introduction 

Multi-stakeholder decision problems arise in many contexts,
including social choice theory, game theory, distributed control
theory, multi-criterion/multi-objective decision theory,
multi-agent systems theory, and social robotics. Although
the particulars of these various contexts can differ widely, to
be rational, each must possess two fundamental attributes:
(a) an ability to make value judgments regarding alternatives,
and (b) procedures for using value judgments to make
choices.

The field of social robotics, in particular, provides a rich
environment for the application of decision-making logics
that are able to accommodate sophisticated social behaviors
such as compromise, negotiation, and altruism. Whether a
social robot interfaces with humans, other robots, or both, it
typically resides in a community that involves some notion
of coordination (which may be either cooperative or competitive).
In such an environment, value judgments can depend
upon the desires and preferences of others, and procedures
for making choices must take these complex social relationships
into account.

This paper provides a mathematical framework within
which to design and synthesize complex decision-making
collectives that are able to accommodate socially complex
decision making. Section 2 provides a brief history of classical
multi-agent decision making and motivates our approach.
Section 3 introduces the key components of the framework
we are proposing. Section 4 introduces new concepts of social
welfare. Section 5 reconciles our theory with classical
approaches. Section 6 describes a special case of what we
term decoupled social systems, and Section 7 offers conclusions.


2  Background 

Cooperative robotics is an active research area. Of particular
interest is the development of theories for decentralized
control of multi-robot societies. Swarm-based approaches
have demonstrated the emergence of cooperative behavior
[24, 25]. Potential functions, consisting of constraints and
goals that are imposed upon the system, have been used to
address the mobile robot navigation problem [5]. Shannon
information theory has been applied to the investigation of
diversity among heterogeneous agents, thereby enabling an
assessment of the ability of the system to perform cooperatively
[4]. Behavior-based approaches have been applied to
the design of cooperative robotic teams, stressing minimalism,
statelessness, and tolerance [53]. The variety displayed
by these approaches is a strong indication of the
complexity involved in the design of cooperative multiagent
systems, and is an indication that there is no single approach
that can be universally applied to the design and synthesis
of such systems.

Because of the complexity of multiagent systems, it is
important to review the fundamental principles that are exploited,
either implicitly or explicitly, in their design. Accordingly,
we provide a brief review of classical decision-theoretic
foundations and a discussion of rationality.


2.1  Classical  Decision-theoretic  Foundations 

The multi-stakeholder decision problem originated in the social
sciences context, with foundations laid by Bergson [6],
Samuelson [39], Arrow [1, 2], and others, who assert that
individual values are the fundamental elements of a society.
Arrow has provided what is perhaps the clearest definition
of this concept: “It is assumed that each individual in
the community has a definite ordering of all conceivable social
states, in terms of their desirability to him ... It is simply
assumed that the individual orders all social states by
whatever standard he deems relevant” [1, p. 17]. Furthermore,
Friedman argues that the process by which these preferences
are obtained is irrelevant: “The economist has little
to say about the formation of wants; this is the province of
the psychologist. The economist’s task is to trace the consequences
of any given set of wants. The legitimacy of any
justification for this abstraction must rest ultimately, in this
case as with any other abstraction, on the light that is shed
and the power to predict that is yielded by the abstraction”
[17, p. 13]. According to the Arrow/Friedman model, each
participant in a multi-agent decision problem comes to the
decision-making activity with pre-defined preference orderings,
the origins of which are not germane to the decision
problem. Such preference orderings are categorical. The assumption
that each individual possesses a categorical preference
ordering has been adopted almost universally in classical
multi-stakeholder decision-making contexts.

The most common procedure for using value judgments
to make choices is to invoke some notion of optimization,
the sine qua non of classical decision theory. As put
by Euler, “Since the fabric of the world is the most perfect
and was established by the wisest Creator, nothing happens
in this world in which some reason of maximum or minimum
would not come to light” (quoted in [35]). What is
optimal in multi-stakeholder settings, however, can depend
upon the point of view. In the classical game-theoretic context,
each individual seeks to optimize value to itself, and a
Nash equilibrium is a constrained mutually optimal solution
for all players in the sense that no individual can unilaterally
improve its welfare by changing its decision. On the other
hand, in the social choice context, it is the “organization incarnate,”
as Raiffa put it [37], who seeks to maximize value
for the group considered as a whole. In the former case, the
value of the individual decisions to the group is not explicitly
considered, and in the latter case, although the value judgments
of the individuals are used to define group-level decisions
(e.g., a weighted sum of individual valuations), there
is no assurance that the resulting decision will maximize the
value to any individual. In fact, the decision that is best for
the group can be extremely unfavorable to some members
of the group.
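The two viewpoints can be contrasted on a toy example. The sketch below uses a hypothetical two-player, two-action game whose payoff numbers are illustrative assumptions, not taken from the paper. It finds the pure-strategy Nash equilibria by checking that neither player can gain from a unilateral deviation, and then finds the profile that maximizes an equally weighted sum of the two valuations.

```python
from itertools import product

ACTIONS = (0, 1)

# Hypothetical payoffs: payoff[(row_action, col_action)] = (row_value, col_value).
payoff = {
    (0, 0): (3, 3),
    (0, 1): (0, 10),
    (1, 0): (4, 1),
    (1, 1): (1, 2),
}


def is_nash(profile):
    """No player can improve its own value by deviating unilaterally."""
    r, c = profile
    row_ok = all(payoff[(r, c)][0] >= payoff[(r2, c)][0] for r2 in ACTIONS)
    col_ok = all(payoff[(r, c)][1] >= payoff[(r, c2)][1] for c2 in ACTIONS)
    return row_ok and col_ok


nash = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]

# Group-level choice: maximize the (equally weighted) sum of individual values.
group_best = max(payoff, key=lambda p: sum(payoff[p]))

print(nash)                            # [(1, 1)]
print(group_best, payoff[group_best])  # (0, 1) (0, 10)
```

In this example the sum-maximizing profile gives the row player its lowest possible value, while the equilibrium profile ignores the much larger total available to the pair.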

2.2  Rationality 

The classical approach to decision making in group settings
is the doctrine of individual rationality: the notion that each
individual should act in a way that maximizes its own satisfaction
(without explicit regard for the satisfaction of others).
This doctrine enjoys a central role in classical decision
theory and game theory. As discussed by Tversky and
Kahneman, “The assumption of [individual] rationality has
a favored position in economics. It is accorded all of the
methodological privileges of a self-evident truth, a reasonable
idealization, a tautology, and a null hypothesis. Each
of these interpretations either puts the hypothesis of rational
action beyond question or places the burden of proof
squarely on any alternative analysis of belief and choice.
The advantage of the rational model is compounded because
no other theory of judgment and decision can ever match it
in scope, power, and simplicity” [52, p. 89].

The uncritical application of individual rationality as a
model for decision making in multi-agent contexts can be
problematic. Arrow has observed that “rationality in application
is not merely a property of the individual. Its useful
and powerful implications derive from the conjunction of
individual rationality and other basic concepts of neoclassical
theory — equilibrium, competition, and completeness of
markets ... When these assumptions fail, the very concept of
rationality becomes threatened, because perceptions of others
and, in particular, their rationality become part of one’s
own rationality” [3, p. 203].

If all agents are indeed focused on, and only on, their
narrow self-interest, then categorical preferences are appropriate.
Difficulties arise, however, when the sphere of concern
of an individual extends beyond its own narrow self-interest.
The only way such an individual can use categorical
preferences to accommodate the preferences of other individuals
is to redefine its values by substituting (at least partially)
the values of the others for its own. Such behavior is a
manifestation of categorical altruism, i.e., irrevocably sacrificing
one’s own welfare in an attempt to benefit another,
thus fundamentally changing the nature of the association.

Considerable research, notably in the field of behavioral
economics [8], has addressed the need for agents to define
their preferences such that they consider social interactions.
Fehr and Schmidt [15] discuss how individual preference orderings
may be modified to take into account concepts such
as fairness and cooperation by introducing a notion of inequity
aversion. To account for this attribute, they include,
in addition to a purely selfish component, an inequity aversion
component in their utility. Consequently, they rely upon
(re-defined) categorical preference orderings to model social
interactions. All that changes is the definition of the individual’s
self-interest. This approach, however, has serious
limitations, as acknowledged by Sen: “It is possible to define
a person’s interests in such a way that no matter what
he does he can be seen to be furthering his own interests in
every isolated act of choice ... no matter whether you are a
single-minded egoist or a raving altruist or a class-conscious
militant, you will appear to be maximizing your own utility
in this enchanted world of definitions” [40, p. 19]. Categorical
altruism simulates cooperation, compromise, and altruism
with a regime that is explicitly designed to characterize
selfishness, competition, and avarice, and does not offer
a natural and intuitively pleasing framework within which
to express sophisticated and complex social relationships.
While such constructions may serve to explain some forms
of human behavior, it is difficult to see how they can be used
systematically to synthesize complex relationships between
artificial agents.

The foundational assumptions of categorical preferences
for each individual and optimization (either for individuals
or for the group) undergird virtually all of classical formalized
decision theory in both individual and group settings.
These assumptions correspond to analysis tools that serve,
with varying success, to explain and predict human behavior,
but they are not causal: they do not govern human behavior.
On the other hand, models that are used to design a system
of artificial autonomous decision-making agents must
be causal: they are synthesis tools that will indeed govern
the behavior of the artificial society.

Many social science researchers argue, however, that the
classical foundational assumptions do not provide an adequate
model for human behavior (e.g., see [26, 47]). And
if their adequacy to analyze human behavior is questioned,
then we may rightly question their appropriateness as assumptions
with which to synthesize the behavior of artificial
societies that are expected to behave in ways that
can be understood and trusted by humans. As Shubik has acknowledged,
“Economic man, operations research man and
the game theory player were all gross simplifications. They
were invented for conceptual simplicity and computational
convenience in models loaded with implicit or explicit assumptions
of symmetry, continuity, and fungibility in order
to allow us (especially in a pre-computer world) to utilize
the methods of calculus and analysis. Reality was placed on
a bed of Procrustes to enable us to utilize the mathematical
techniques available” [43].

It is time to make the bed fit its occupant. Particularly in
the context of artificial multi-agent system design and synthesis,
a framework is needed to model explicitly the possibly
complex value judgments that may exist among the members
of an artificial society. It is time to account for situations
where the conditions under which preferences are formed
are relevant and cannot be summarily ignored; it is time to
accommodate more complex and flexible criteria for making
decisions. In other words, it is time to move beyond categorical
preference orderings and optimization as the foundational
components of multi-stakeholder decision making.

3  A  Social  Framework  for  Cooperative  Decision 
Making 

A social welfare function, as defined by Arrow, is “a process
or rule which, for each set of individual orderings $R_1$,
$R_2$, ..., $R_n$ for alternative social states (one ordering for
each individual), states a corresponding social ordering of
alternative social states $R$” [1, p. 23]. Classically, the individual
orderings (either ordinal or cardinal) are categorical,
in that they account only for the interests of the individuals.
We wish to expand the spheres of interest of the individuals
to include the interests of others as well as their own. However, once
we move beyond restricting attention to individual interests, the notion
of optimization becomes problematic. Optimization is
an individual concept: for a group to optimize, it must act
as a single unit, capable of making rational judgments and
choices. Such a structure, however, is not consistent with our
assumption that the decision makers are autonomous.

Our approach is to replace the twin assumptions of optimization
and categorical preferences with two alternative
concepts: satisficing and conditioning. Our goal is to create
a satisficing social welfare function and individual welfare
functions that can be used to construct compromise solutions
that are simultaneously acceptable to the group and
the individuals, thereby removing, or at least reducing, the
wedge that separates classical concepts of group and individual
interests.

3.1  Satisficing 

In a multi-agent setting it is not generally possible to maximize
both individual and group preferences simultaneously.
A potentially more socially accommodating concept is that
decisions are “good enough.” What is best for you may be
different from what is best for me, but what is good enough
for you may also be good enough for me, provided we have
some flexibility in our respective notions of what it means to
be good enough. The term “satisficing” has been advanced
as a synonym for this alternative to strict optimization.

The first usage of the term “satisficing” in a decision-theoretic
context is attributed to Simon [44-46], who addressed
the question of how a decision maker might make
a choice in the presence of informational or computational
limitations. Simon’s approach is to seek an optimal choice,
but to terminate searching once the decision maker’s aspiration
level has been met. Put another way, to satisfice is
to accept the best solution so far obtained, once the cost of
continuing to search exceeds the expected improvement in
value were the search to continue. Many other variations of
this concept have appeared in the literature [7, 14, 20, 23, 28,
29, 33, 36, 41, 50, 51, 54-56], and it is not the intent of this
paper to review them in detail. Suffice it to say, however,
that all of these approaches view satisficing as a species of
bounded rationality: one settles for a solution that is deemed
to be “good enough,” but which is not necessarily, and usually
not, optimal in any meaningful sense. Satisficing à la
Simon is a heuristic approximation to the ideal of being
best (and is only constrained from achieving this ideal by
practical limitations).
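The aspiration-level idea described above can be sketched in a few lines. This is an illustrative reading of the stopping rule, not an implementation taken from Simon or from this paper; the function name, the `aspiration` parameter, and the fallback to the best candidate seen are assumptions made for the example.

```python
def aspiration_search(candidates, value, aspiration):
    """Examine candidates in order and stop at the first one that is good enough.

    Returns the first candidate whose value meets the aspiration level. If the
    search is exhausted first, returns the best candidate seen (None if empty).
    """
    best = None
    for candidate in candidates:
        if best is None or value(candidate) > value(best):
            best = candidate
        if value(candidate) >= aspiration:
            return candidate  # good enough: stop searching
    return best


# Hypothetical usage: the candidates are numbers and their value is the number itself.
print(aspiration_search([3, 7, 9, 12], lambda x: x, aspiration=8))    # 9
print(aspiration_search([3, 7, 9, 12], lambda x: x, aspiration=100))  # 12
```

The first call stops at 9 even though 12 is available, which is the sense in which this form of satisficing approximates, rather than attains, the optimum.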

The concept of satisficing developed herein differs from
the aforementioned notion in several important ways. First,
in contrast to satisficing as advanced by Simon and others,
it is not heuristic; rather, it provides a concept of satisficing
that is as mathematically formalized and precise as is the
notion of optimization. Second, it treats the notion of being
good enough as the ideal (rather than an approximation);
it is not a species of bounded rationality. Third, it extends to
the multi-agent case, thereby providing a natural framework
for multi-agent decision making. Fourth, it readily accommodates
the extension of interests beyond the self, thereby
accommodating more sophisticated social relationships than
self-interest affords. We retain the term “satisfice” because,
even though our approach is not heuristic, we nevertheless
seek solutions that are good enough, with the essential difference
being that we provide a non-heuristic definition of
what it means to be good enough.

Although it seems eminently reasonable at least to attempt
(given sufficient resources) to seek an optimal decision,
humans often invoke a systematic approach to decision
making (even in single-agent decision problems) that,
while still based on quantitative measures of performance,
does not correspond to optimization. In the vernacular, the
optimization paradigm corresponds to seeking “the best and
only the best” solution. Also common, however, is the paradigm
of “getting your money’s worth.” In an intuitively pleasing
sense, this latter notion admits an interpretation as being
good enough, and it is this concept that we invoke as the
satisficing paradigm that we develop in this paper. A comprehensive
introduction to this perspective can be found in
[49].

Many theorists (e.g., [1, 13, 18, 27]) have argued that it is
unwise to aggregate conflicting interests into a single preference
ordering. Some have asserted that in a social setting
individuals have multiple facets, as defined by Steedman
and Krause [48], who maintain that an agent, although
an indivisible unit, nevertheless is capable of considering its
choices from different points of view, and that separate utilities
may be defined to correspond to each facet of an individual.
A natural way to classify attributes is according to their
effectiveness and efficiency. Each individual may be viewed
as being composed of two facets: the selecting facet, which
evaluates actions in terms of effectiveness toward pursuing
objectives without concern for efficiency, and the rejecting
facet, which evaluates actions in terms of efficiency with respect
to consuming resources without concern for effectiveness.
We shall view these selecting and rejecting facets as
the “atoms” of the society.

When formulating a problem under the satisficing framework,
it is essential that the selecting and rejecting criteria
not be restatements of each other. The selecting criterion
should correspond to the goals of the problem, and the rejecting
criterion should correspond to the consumption of
resources. This dual utilities approach is the basis for our
notion of satisficing.

Under the optimization paradigm, all of the performance
measures are combined into a single utility, whereas under
the satisficing paradigm, the measures of effectiveness
are encoded separately from the measures of efficiency. Under
the optimization paradigm, the alternatives are compared
against each other in order to identify the globally best one.
By contrast, under the satisficing paradigm, the effectiveness
and efficiency attributes are locally compared for each alternative
separately, and all alternatives for which the effectiveness
measures exceed the efficiency measures are considered
to be satisficing. Thus, whereas the optimization paradigm
is designed to identify a single best alternative, the satisficing
paradigm is designed to identify all alternatives that
are good enough. The non-uniqueness attribute is a key feature
of satisficing in a multi-stakeholder environment, since
it is amenable to flexibility on the part of the individuals and
of the group.

To introduce the formalism of satisficing, let us first consider a single agent $X$, with selecting and rejecting facets denoted $S$ and $R$, respectively. Let $u_S$ denote the selecting utility, or selectability, which measures progress toward the goal of $X$, and let $u_R$ denote the rejecting inutility, or rejectability, which measures the consumption of resources such as cost, exposure to hazard, loss of social reputation, and so forth.

Definition 1 Let $A$ denote the set of actions available to $X$. An action $a \in A$ is satisficing if $u_S(a) \ge q\,u_R(a)$, where $q \in [0, 1]$ regulates the threshold for rejecting elements of $A$ as not satisficing. (Nominally, $q = 1$, but as we shall see, $q$ can serve as a measure of how willing an agent is to negotiate.) The satisficing set is

$$\Sigma_q = \{a \in A : u_S(a) - q\,u_R(a) \ge 0\}. \qquad (1)$$

Satisficing as defined above is expressed in a single-agent context with categorical utilities. In this simple context, $u_S$ and $u_R$ can be combined to form a classical utility $u_X(a) = u_S(a) - q\,u_R(a)$, which is amenable to optimization. Optimization, however, is designed to produce a single best solution, whereas satisficing is designed to produce a (possibly) non-singleton set of solutions that are good enough in the sense that the effectiveness of the action is at least as great as its inefficiency. In the single-agent context, satisficing represents a novel approach, but if it is possible to optimize, then there may be little incentive to seek a satisficing solution. The real power of the satisficing concept, however, is manifest in the multi-agent case, as will be further developed below.
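
The contrast between the satisficing set of Definition 1 and a single optimizer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three actions and their selectability and rejectability values are hypothetical, and the function names `satisficing_set` and `optimal_action` are ours rather than part of the formalism.

```python
def satisficing_set(actions, u_s, u_r, q=1.0):
    """Return every action a with u_S(a) - q * u_R(a) >= 0, as in Eq. (1)."""
    return {a for a in actions if u_s[a] - q * u_r[a] >= 0}


def optimal_action(actions, u_s, u_r, q=1.0):
    """Single maximizer of the combined classical utility u_X = u_S - q * u_R."""
    return max(actions, key=lambda a: u_s[a] - q * u_r[a])


# Hypothetical selectability and rejectability values (not from the paper).
actions = ["a1", "a2", "a3"]
u_s = {"a1": 0.5, "a2": 0.2, "a3": 0.3}
u_r = {"a1": 0.3, "a2": 0.5, "a3": 0.2}

nominal = satisficing_set(actions, u_s, u_r, q=1.0)   # a2 is rejected
relaxed = satisficing_set(actions, u_s, u_r, q=0.3)   # a2 becomes good enough
best = optimal_action(actions, u_s, u_r, q=1.0)       # one action only
```

With these assumed values, the nominal threshold retains two actions, lowering $q$ admits the third, and the optimizer returns a single action regardless.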

3.2  Conditioning 

Let $X = \{X_1, \ldots, X_n\}$ denote a collective of autonomous stakeholders (e.g., agents). More specifically, let $S = \{S_1, \ldots, S_n\}$ denote the collective of selecting facets, and let $R = \{R_1, \ldots, R_n\}$ denote the collective of rejecting facets. Notationally, we write $V = SR = \{S_1, \ldots, S_n, R_1, \ldots, R_n\}$, a system of $2n$ facets. Since we will be dealing with the facets, rather than the agents, it is convenient to use the symbol $V_i$, $i = 1, \ldots, 2n$, to denote either a selecting facet or a rejecting facet.

Let $A_i$ denote a finite set of alternatives available to $X_i$. Of course, if $X_i$ takes action $a_i \in A_i$, then that action also applies to $S_i$ and $R_i$ (split personalities are not allowed, but this does not mean that $S_i$ and $R_i$ must always contemplate taking the same action). The product action space is denoted $\mathbf{A} = A_1 \times \cdots \times A_n$, and an action profile $\mathbf{a} = (a_1, \ldots, a_n) \in \mathbf{A}$ denotes the joint action taken by the collective.

A categorical utility for $V_i$, denoted $u_{V_i}$, is a mapping $\mathbf{A} \to \mathbb{R}$, and provides a total ordering of all action profiles for $V_i$. According to the conventional Arrow/Friedman model, categorical utilities for all participants in the multi-stakeholder decision problem are defined prior to the decision-making activity and, furthermore, the mechanisms that dictate the way they are defined are irrelevant. As an alternative, we introduce the notion of a conditional utility. To develop this concept, we must first define a commitment.


Definition 2 Let $V_i$ be an arbitrary element of $V$, and let $\mathbf{V}_j = \{V_{j_1}, \ldots, V_{j_k}\}$ be an arbitrary $k$-element subset of $V$ that does not include $V_i$. A commitment profile $\{\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}\}$, $\mathbf{a}_{j_\ell} \in \mathbf{A}$, is a hypothetical statement by $\mathbf{V}_j$ that the action profile $\mathbf{a}_{j_\ell}$ is the one that is most preferred by $V_{j_\ell}$, $\ell = 1, \ldots, k$.

Definition 3 A conditional utility for $V_i$ with respect to a commitment profile $\{\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}\}$, denoted $u_{V_i|\mathbf{V}_j}(\mathbf{a} \mid \mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k})$, is a utility for $V_i$ given that $\mathbf{V}_j$ is committed to $\{\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}\}$.

Operationally, a conditional utility for $V_i$ serves as the consequent of a hypothetical proposition whose antecedent is a commitment by $\mathbf{V}_j$. This expression does not represent $V_i$'s actual utility of $\mathbf{a}$, nor does it imply that $V_{j_\ell}$ truly most prefers $\mathbf{a}_{j_\ell}$. Instead, it means that, if $\{V_{j_1}, \ldots, V_{j_k}\}$ were simultaneously to prefer $\{\mathbf{a}_{j_1}, \ldots, \mathbf{a}_{j_k}\}$ to all other action profiles, then $V_i$ would define its utility of $\mathbf{a}$ accordingly.

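An attractive feature of a conditional utility is that it permits $V_i$ to express conditional altruism. To illustrate, suppose $u_{V_j}(\mathbf{a}) \gg u_{V_j}(\mathbf{a}')$, that is, $V_j$ were to ascribe much higher categorical utility to $\mathbf{a}$ than to $\mathbf{a}'$, but $V_i$ were to do the opposite, ascribing higher utility to $\mathbf{a}'$ than to $\mathbf{a}$, i.e., $u_{V_i}(\mathbf{a}') > u_{V_i}(\mathbf{a})$. $V_i$ could give deference to $V_j$ by replacing its categorical utility $u_{V_i}$ with a conditional utility $u_{V_i|V_j}$ such that $u_{V_i|V_j}(\mathbf{a} \mid \mathbf{a}) > u_{V_i|V_j}(\mathbf{a}' \mid \mathbf{a})$, while its valuations conditioned on a commitment by $V_j$ to $\mathbf{a}'$ remain those of its categorical utility, so that $\mathbf{a}'$ is still favored in that case. $V_i$ thus defers to $V_j$ if, but only if, $V_j$ were to favor $\mathbf{a}$ strongly over $\mathbf{a}'$.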

3.3  Social  Networks 

Conditional  preferences  provide  each  individual  with  the  abil¬ 
ity  to  define  its  preferences  as  a  function  of  the  hypothetical 
preferences  of  all  other  subsets  of  the  collective.  This  feature 
represents  an  important  departure  from  the  traditional  cate¬ 
gorical  definitions  of  preference  and  provides  the  foundation 
for  the  modeling  of  a  complex  society  that  possesses  sophis¬ 
ticated  social  relationships  such  as  altruism  (either  benev¬ 
olent  or  malevolent).  Conditional  preference  relations  per¬ 
mit  the  explicit  modeling  of  such  relationships,  rather  than 
merely  simulating  them  by  redefining  categorical  preference 
orderings.  Although  conditional  preference  relations  are  more 
complex  than  are  categorical  ones,  as  noted  by  Palmer,  “Com¬ 
plexity  is  no  argument  against  a  theoretical  approach  if  the 
complexity  arises  not  out  of  the  theory  itself  but  out  of  the 
material  which  any  theory  ought  to  handle”  [32,  p.  176]. 

Nevertheless,  the  introduction  of  conditional  utilities  in¬ 
creases  the  complexity  of  the  mathematical  model  of  a  col¬ 
lective.  At  one  extreme,  all  of  the  members  of  the  collective 
would  be  devoted  to  narrow  self-interest,  and  all  utilities 
would  be  categorical  (the  classical  game-theoretic  model). 
At  the  other  extreme,  each  of  the  members  would  be  influ¬ 
enced  by  the  preferences  of  every  other  member,  resulting 
in a fully connected collective. Fortunately, however, many
potentially  interesting  societies  are  such  that  the  connections 
between  the  members  are  relatively  sparse.  Just  as  with  hu¬ 
man  societies,  it  is  likely  that  members  will  be  organized 
into  relatively  small  clusters  of  individuals  that  are  some¬ 
what  loosely  connected  with  other  clusters.  One  such  model 
is a hierarchical structure, where the preferences of superiors influence those of subordinates. Another, more parallel model is one where the individuals are grouped into functional, spatial, or temporal neighborhoods. A powerful and
convenient  way  to  represent  such  relationships  is  through 
graph  theory,  which  provides  a  means  to  express  directly 
the  influence  relationships  that  exist  among  the  individuals. 
With  such  a  formalism,  the  vertices  of  the  graph  represent 
the  members  of  the  collective,  and  the  edges  represent  the 
influence  flows  among  them  as  encoded  in  the  conditional 
utilities.  For  the  extreme  case  where  all  individuals  possess 
categorical  preferences,  the  graph  would  have  no  edges  — 
each  individual  would  be  expressed  by  an  isolated  vertex. 
When  conditional  preferences  exist,  however,  the  graph  will 
have  edges,  as  illustrated  in  Figure  1 . 


Fig.  1  A  directed  acyclic  graph 


In this paper we concentrate on directed acyclic graphs, or DAGs. Formally, a directed graph is a pair $\mathcal{G} = (V, E)$, where $V = \{V_1, \ldots, V_{2n}\}$ is a finite set of vertices and $E$ is a set of edges linking pairs of vertices. If $V_j$ is directly influenced by $V_i$ but $V_j$ does not directly influence $V_i$, then there is a directed edge, denoted $V_i \to V_j$, from $V_i$ to $V_j$. A path from $V_i$ to $V_j$ is a sequence of vertices $\{V_i, V_{k_1}, V_{k_2}, \ldots, V_j\}$ such that $V_i \to V_{k_1} \to V_{k_2} \to \cdots \to V_j$. We write $V_i \mapsto V_j$ if there is a path from $V_i$ to $V_j$. If there are no paths such that $V_i \mapsto V_i$ for any $i$, the graph is said to be acyclic.

If $V_i \to V_j$, then $V_i$ is called a parent of $V_j$, and $V_j$ is a child of $V_i$. The set of parents of $V_i$ is denoted $\mathrm{pa}(V_i) = \{V_{i_j} : V_{i_j} \to V_i,\ j = 1, \ldots, p_i\}$, and the set of children of $V_i$ is denoted $\mathrm{ch}(V_i)$. If $V_i$ has no parents, then $\mathrm{pa}(V_i) = \emptyset$. The descendants of $V_i$, denoted $\mathrm{de}(V_i)$, form the subset of vertices $\{V_{i_m} : V_i \mapsto V_{i_m},\ m = 1, \ldots, d_i\}$.

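Let $\mathrm{cp}(V_i) = \{\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_{p_i}}\}$ denote the commitment profile for $\mathrm{pa}(V_i)$. For each $V_i$, $u_{V_i|\mathrm{pa}(V_i)}[\mathbf{a} \mid \mathrm{cp}(V_i)]$ is the utility that $V_i$ ascribes to $\mathbf{a}$, given that $V_{i_j}$ commits to $\mathbf{a}_{i_j}$, $j = 1, \ldots, p_i$. If $V_i$ has no parents, the conditional utility is the categorical utility; i.e., $u_{V_i|\mathrm{pa}(V_i)} = u_{V_i}$ if $\mathrm{pa}(V_i) = \emptyset$. Consider the DAG illustrated in Figure 1. By inspection, $\mathrm{pa}(S_1) = \mathrm{pa}(R_3) = \mathrm{pa}(R_4) = \emptyset$, $\mathrm{pa}(S_2) = \{R_1\}$, $\mathrm{pa}(S_3) = \{S_2, S_4, R_1\}$, $\mathrm{pa}(S_4) = \{R_3\}$, $\mathrm{pa}(R_1) = \{R_2\}$, and $\mathrm{pa}(R_2) = \{S_1\}$.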

A fundamental property of a DAG is the Markov condition: nondescendent nonparents of a vertex have no influence on the vertex, given the state of its parent vertices [10]. Consequently, if a society can be represented as a DAG, the conditional utility of a facet is dependent only upon the commitments of its parents. Thus, for the DAG in Figure 1, $R_2$ is influenced only by the commitments of $S_1$, $S_3$ is influenced by the commitments of $S_2$, $S_4$, and $R_1$, and so forth. The conditional utility of $R_2$ is therefore of the form $u_{R_2|\mathrm{cp}(R_2)}$, where $\mathrm{cp}(R_2) = \{S_1\}$, and the conditional utility of $S_3$ is of the form $u_{S_3|\mathrm{cp}(S_3)}$, where $\mathrm{cp}(S_3) = \{S_2, S_4, R_1\}$. Categorical utilities are associated with the root nodes, $S_1$, $R_3$, and $R_4$, since these nodes have no parents.
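
The parent sets listed for Figure 1 can be encoded directly, which makes it easy to read off the root vertices and to confirm that the influence graph is acyclic. The following Python sketch assumes only the parent sets given in the text; the dictionary layout and the helper names `roots` and `topological_order` are illustrative choices.

```python
# Parent sets as listed for the DAG of Figure 1.
parents = {
    "S1": set(), "S2": {"R1"}, "S3": {"S2", "S4", "R1"}, "S4": {"R3"},
    "R1": {"R2"}, "R2": {"S1"}, "R3": set(), "R4": set(),
}


def roots(parents):
    """Vertices with no parents; these carry categorical utilities."""
    return sorted(v for v, pa in parents.items() if not pa)


def topological_order(parents):
    """Order vertices so parents precede children; fail if a cycle exists."""
    remaining = {v: set(pa) for v, pa in parents.items()}  # work on a copy
    order = []
    while remaining:
        ready = sorted(v for v, pa in remaining.items() if not pa)
        if not ready:
            raise ValueError("graph contains a cycle")
        for v in ready:
            order.append(v)
            del remaining[v]
        for pa in remaining.values():
            pa.difference_update(ready)  # these parents are now resolved
    return order


root_vertices = roots(parents)
order = topological_order(parents)
```

A valid ordering exists exactly when the graph is acyclic, so a successful call also serves as an acyclicity check for a proposed influence structure.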

4  Social  Welfare 

4.1 Collective Preferences

The  central  question  for  a  collective  of  autonomous  decision 
makers  is  how  they  should  function  as  a  group.  In  the  classi¬ 
cal  non-cooperative  game-theoretic  formulation,  the  notion 
of  a  group  preference  is  irrelevant  —  each  individual  is  com¬ 
mitted  to,  and  only  to,  its  own  satisfaction,  and  the  emer¬ 
gence  of  a  coherent  notion  of  group  welfare  would  be  strictly 
coincidental.  As  observed  by  Shubik,  “It  may  be  meaning¬ 
ful,  in  a  given  setting,  to  say  that  group  ’chooses’  or  ’de¬ 
cides’  something.  It  is  rather  less  likely  to  be  meaningful 
to  say  that  the  group  ’wants’  or  ’prefers’  something”  [42, 
p.  124].  Social  choice  theory,  on  the  other  hand,  focuses  on 
the  aggregation  of  individual  preferences  to  form  a  social 
welfare  function  that  can  be  used  to  define  what  is  best  for 
the group. Classical social choice theory, however, as developed by Arrow [1], Debreu [12], Fishburn [16], and others, also relies upon categorical preferences, as does multi-objective decision theory [21]. The main classical result, attributed to Debreu, is that a necessary and sufficient condition for a group utility to be defined as the weighted sum of individual utilities is that the individual utilities be categorical.

In the presence of conditional preferences, the issue of social welfare takes on added complexity. For example, the traditional axioms of social choice theory, such as the independence of irrelevant alternatives, become problematic. Thus, we must pursue a different course when aggregating conditional preferences. In the interest of clarity, we begin our discussion of this concept with the bi-agent case, with $V = \{V_1, V_2\}$. Let us suppose that $V_1$ possesses a categorical utility $u_{V_1}$ and $V_2$ possesses a conditional utility $u_{V_2|V_1}$. The corresponding DAG is displayed in Figure 2. Given these utilities, the central questions are: (i) Can these two utilities be combined in a rational way to form a group utility? and, if so, (ii) How should they be combined?


Fig.  2  A  two-agent  DAG 


To address this issue, we introduce the notion of a joint commitment. A joint commitment by $(V_1, V_2)$ is a condition that, simultaneously, $V_1$ is committed to $(a_1, a_2) \in A_1 \times A_2$ and $V_2$ is committed to $(a_1', a_2') \in A_1 \times A_2$. The utility of a joint commitment would provide a complete description of the way the collective views all possible consequence profiles (one for each decision maker). It would provide information regarding the degree of conflict and the possibilities for compromise, since only one profile can actually be implemented by the collective.

If there were no conflicts, then there would exist a joint commitment of the form $[(a_1^*, a_2^*), (a_1^*, a_2^*)]$ that would simultaneously maximize benefit to $V_1$ and $V_2$ and, hence, by the Pareto principle, to the collective. In the presence of conflicts, however, joint commitments of the form $[(a_1, a_2), (a_1, a_2)]$, where both commit to the same profile, would represent a compromise solution. The issue, then, is to define an acceptable compromise.

To determine the utility of a joint commitment, consider the following argument. For $[(a_1, a_2), (a_1', a_2')]$ to be a joint commitment, it is necessary that $(a_1, a_2)$ be a commitment by $V_1$. But if $(a_1, a_2)$ is a commitment by $V_1$, then for $(a_1', a_2')$ to be a commitment by $V_2$, $(a_1', a_2')$ must be a commitment by $V_2$ given that $(a_1, a_2)$ is a commitment by $V_1$. Furthermore, if $(a_1, a_2)$ is not a commitment by $V_1$, then $[(a_1, a_2), (a_1', a_2')]$ is not a joint commitment, regardless of whether or not $(a_1', a_2')$ is a commitment by $V_2$. Thus, when considering the utility of a joint commitment to $[(a_1, a_2), (a_1', a_2')]$, if the utility of a commitment to $(a_1, a_2)$ by $V_1$ is considered first, then the utility of a commitment to $(a_1', a_2')$ by $V_2$ will be relevant only if $(a_1, a_2)$ is a commitment by $V_1$. Consequently, given the utility of a commitment to $(a_1, a_2)$ by $V_1$ and the conditional utility of a commitment to $(a_1', a_2')$ by $V_2$ given that $(a_1, a_2)$ is a commitment by $V_1$, knowledge of the categorical utility of a commitment to $(a_1', a_2')$ by $V_2$ is not required in order to compute the utility of a joint commitment to $[(a_1, a_2), (a_1', a_2')]$. Thus, the utility of a joint commitment to $[(a_1, a_2), (a_1', a_2')]$ is a function of the categorical utility of a commitment to $(a_1, a_2)$ by $V_1$ and the conditional utility of a commitment by $V_2$ to $(a_1', a_2')$ given that $(a_1, a_2)$ is a commitment by $V_1$.


Let $u_{V_1 V_2}$ denote the utility of a joint commitment. By the above arguments, this function can be expressed as

$$u_{V_1 V_2}[(a_1, a_2), (a_1', a_2')] = F[u_{V_1}(a_1, a_2),\ u_{V_2|V_1}(a_1', a_2' \mid a_1, a_2)] \qquad (2)$$

for some function $F$, called the aggregation function.

4.2  Aggregation 

Obviously,  there  are  many  possibilities  for  F,  and  to  nar¬ 
row  the  choices,  it  is  necessary  to  impose  some  additional 
constraints.  One  reasonable  constraint  is  that  the  collective 
possess  at  least  a  weak  sense  of  equity  so  that  a  meaning¬ 
ful  notion  of  cooperation  can  occur.  Specifically,  we  wish  to 
avoid  a  condition  of  categorical  subjugation.  To  introduce 
this concept, let us restrict interest to the collective $S_1$ and $V_2$. We shall say that $S_1$ is categorically subjugated to the collective if every consequence profile that is acceptable to the collective would require $S_1$ to sacrifice its performance. Suppose that

$$u_{S_1}(a_1', a_2') > u_{S_1}(a_1'', a_2''), \qquad (3)$$

but

$$u_{S_1 V_2}[(a_1', a_2'), (a_1, a_2)] < u_{S_1 V_2}[(a_1'', a_2''), (a_1, a_2)] \qquad (4)$$

for all $(a_1, a_2) \in A_1 \times A_2$. Then $S_1$ would be categorically subjugated, since $S_1$'s preferred joint action can never be
preferred  by  the  society.  Avoiding  categorical  subjugation 
ensures  that  all  participants  have  a  “seat  at  the  table”  when 
negotiating.  Otherwise,  the  interests  of  some  facets  will  be 
so  contrary  to  the  interests  of  the  collective  that,  no  mat¬ 
ter  what  the  collective  decides,  the  interests  of  the  affected 
individual  facets  will  be  suppressed.  Unless  the  possibility 
(although  not  the  guarantee)  exists  that  the  interests  of  the 
individual  are  compatible  with  the  interests  of  the  collective, 
the  individual  will  be  effectively  disenfranchised.  Although 
categorical  subjugation  is  not  always  avoided  in  human  so¬ 
cieties  (e.g.,  dictatorships),  avoiding  categorical  subjugation 
is  an  important  feature  of  an  artificial  society  that  must  ne¬ 
gotiate  to  reach  a  compromise. 

If categorical subjugation is to be avoided, then there must exist an action profile $(a_1, a_2)$ such that, if (3) holds, then

$$u_{S_1 V_2}[(a_1', a_2'), (a_1, a_2)] \ge u_{S_1 V_2}[(a_1'', a_2''), (a_1, a_2)]. \qquad (5)$$

A similar argument regarding the categorical subjugation of, say, $R_1$ can be made with the inequalities reversed in (3), (4), and (5) when $R_1$ replaces $S_1$.

The  question  now  becomes:  what  conditions  are  neces¬ 
sary  to  impose  upon  the  aggregation  function  F  to  ensure 
that  categorical  subjugation  can  never  occur?  To  address  this 


question, let us turn to an analogous issue. A Dutch book is a gambling situation such that, no matter what the outcome, the gambler will be worse off for having taken the gamble: a situation of sure loss (one's reward is always less than one's stake). To illustrate a Dutch book, suppose $Y$ can take one of two distinct values, $y_1$ or $y_2$, and let $q(y)$ denote a belief function of $y$; that is, $q(y)$ measures the strength of belief that $Y = y$. Without loss of generality, we may restrict belief functions to the unit interval; that is, $0 \le q(y) \le 1$. (We refrain from using the term “probability” here, since we do not require $q$ to possess all of the properties of a probability mass function.)

By convention, we will assume that we have full belief that exactly one of these values obtains, that is, that the disjunction of $y_1$ and $y_2$ must occur, and that beliefs are additive; thus,

$$q(y_1 \vee y_2) = q(y_1) + q(y_2) = 1. \qquad (6)$$

Now let $Z$ take on one of two distinct values $z_1$ or $z_2$, and let $r(z, y)$ denote the belief that $Z = z$ and $Y = y$ simultaneously. Let us now assume that

$$q(y_2) > q(y_1) \qquad (7)$$

$$r(z_1, y_2) < r(z_1, y_1) \qquad (8)$$

$$r(z_2, y_2) < r(z_2, y_1). \qquad (9)$$

The following example illustrates a Dutch book. Suppose you purchase a \$1 gamble that $Y = y_2$, and deem a fair purchase price to be $q(y_2)$; that is, you pay \$$q(y_2)$ for the gamble to win \$1. Now also suppose you sell the gamble $(z_1, y_2) \vee (z_2, y_2)$. By additivity of beliefs, a fair selling price for this bet would be $r[(z_1, y_2) \vee (z_2, y_2)] = r(z_1, y_2) + r(z_2, y_2)$. However, according to the above ordering, you must have $q(y_2) > \tfrac{1}{2}$ and, since $r(z_1, y_2) + r(z_2, y_2) < r(z_1, y_1) + r(z_2, y_1)$, it follows (provided the four joint beliefs also sum to unity) that $r[(z_1, y_2) \vee (z_2, y_2)] < \tfrac{1}{2}$.

After all gambles have been bought and sold, your net wealth is $r[(z_1, y_2) \vee (z_2, y_2)] - q(y_2) < 0$. To overcome this loss, you hope to make up the difference once the outcome of the gamble is known. But if neither $y_2$ nor $(z_1, y_2) \vee (z_2, y_2)$ occurs, you win nothing and you pay nothing, and if $(z_1, y_2) \vee (z_2, y_2)$ occurs, then, of course, $y_2$ occurs, so you win \$1, which you must pay to the buyer of your gamble. Thus, once the gambles have been bought and sold, your net wealth is invariant to whatever happens: you suffer a sure loss.
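
The sure loss can be checked numerically. The Python sketch below assumes one particular set of beliefs that satisfies (6) through (9), with the four joint beliefs summing to unity; these numbers are hypothetical and serve only to show that the net wealth is the same negative amount under every outcome.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical beliefs chosen to satisfy (6)-(9); they are not from the paper.
q = {"y1": Fr(2, 5), "y2": Fr(3, 5)}
r = {
    ("z1", "y1"): Fr(3, 10), ("z2", "y1"): Fr(3, 10),
    ("z1", "y2"): Fr(1, 5), ("z2", "y2"): Fr(1, 5),
}

price_paid = q["y2"]                                 # cost of the gamble bought on y2
price_received = r[("z1", "y2")] + r[("z2", "y2")]   # income from the gamble sold


def net_wealth(z, y):
    """Net wealth after outcome (z, y): prices, plus winnings, minus payout."""
    winnings = 1 if y == "y2" else 0   # the bought gamble pays $1 when Y = y2
    payout = 1 if y == "y2" else 0     # the sold gamble pays out in the same cases
    return price_received - price_paid + winnings - payout


outcomes = {(z, y): net_wealth(z, y)
            for z, y in product(("z1", "z2"), ("y1", "y2"))}
```

Because the winnings and the payout always cancel, the result depends only on the two prices, and the gap between them is exactly the failure of marginalization.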

A belief system is said to be coherent if it is not possible to construct a Dutch book. The Dutch Book Theorem [11,38] and its converse [22] state that a belief system is coherent if and only if it complies with a probability measure that describes the degrees of belief regarding the propositions under consideration. The above example does not comply with the laws of probability theory, since $q(y_2) \ne r(z_1, y_2) + r(z_2, y_2)$; that is, marginalization fails.


The above discussion illustrates the fact that categorical subjugation and sure loss are mathematically equivalent. Thus, if a multi-agent valuation system is to be coherent, in that it is not possible to construct a situation where categorical subjugation can occur, then the valuation system must comply with the mathematical structure of probability theory.

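Definition 4 Let $u_{V_i}$ denote a categorical utility for $V_i$. The collective $V$ is coherent if, for each $i \in \{1, \ldots, 2n\}$, given that $u_{V_i}(\mathbf{a}) > u_{V_i}(\mathbf{a}')$, there exists a commitment sub-profile $(\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \mathbf{a}_{i+1}, \ldots, \mathbf{a}_{2n})$ such that

$$u_{V_1 \cdots V_{2n}}(\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \mathbf{a}, \mathbf{a}_{i+1}, \ldots, \mathbf{a}_{2n}) \ge u_{V_1 \cdots V_{2n}}(\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \mathbf{a}', \mathbf{a}_{i+1}, \ldots, \mathbf{a}_{2n}) \qquad (10)$$

if $V_i$ is a selecting facet, with the inequalities reversed if $V_i$ is a rejecting facet.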

Let $V$ be a group of decision-making facets whose influence relationships can be expressed with a directed acyclic graph. For each $V_i$, let $\mathrm{pa}(V_i) = \{V_{i_1}, \ldots, V_{i_{p_i}}\}$ denote the $p_i$ parents of $V_i$, and let $\mathbf{A}^{p_i} = \mathbf{A} \times \cdots \times \mathbf{A}$ ($p_i$ times) denote the $p_i$-fold product of the joint action space corresponding to the parents of $V_i$. If $V_i$ has no parents, then $\mathbf{A}^{p_i} = \emptyset$. Let $\mathrm{cp}(V_i) = (\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_{p_i}})$ denote the commitment profile for $\mathrm{pa}(V_i)$. For each $V_i$, $u_{V_i|\mathrm{pa}(V_i)}[\mathbf{a} \mid \mathrm{cp}(V_i)]$ is the utility that $V_i$ ascribes to $\mathbf{a}$, given that $V_{i_j}$ commits to $\mathbf{a}_{i_j}$, $j = 1, \ldots, p_i$. If $V_i$ has no parents, the conditional utility is the categorical utility; i.e., $u_{V_i|\mathrm{pa}(V_i)} = u_{V_i}$ if $\mathrm{pa}(V_i) = \emptyset$.

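Theorem 1 If a society can be represented as a directed acyclic graph, categorical subjugation cannot occur if and only if the utilities $u_{V_i|\mathrm{pa}(V_i)}$ are conditional mass functions. That is,

$$u_{V_i|\mathrm{pa}(V_i)}[\mathbf{a} \mid \mathrm{cp}(V_i)] \ge 0 \quad \forall\, \mathbf{a} \in \mathbf{A} \qquad (11)$$

and

$$\sum_{\mathbf{a} \in \mathbf{A}} u_{V_i|\mathrm{pa}(V_i)}[\mathbf{a} \mid \mathrm{cp}(V_i)] = 1 \qquad (12)$$

for all $(\mathbf{a}_{i_1}, \ldots, \mathbf{a}_{i_{p_i}}) \in \mathbf{A}^{p_i}$. Furthermore, the utility of a joint commitment to $(\mathbf{a}_1, \ldots, \mathbf{a}_{2n})$ is

$$u_V(\mathbf{a}_1, \ldots, \mathbf{a}_{2n}) = \prod_{i=1}^{2n} u_{V_i|\mathrm{pa}(V_i)}[\mathbf{a}_i \mid \mathrm{cp}(V_i)] \qquad (13)$$

or, more specifically,

$$u_{SR}(\mathbf{a}_1, \ldots, \mathbf{a}_n, \mathbf{a}_1', \ldots, \mathbf{a}_n') = \prod_{i=1}^{n} u_{S_i|\mathrm{pa}(S_i)}[\mathbf{a}_i \mid \mathrm{cp}(S_i)] \prod_{j=1}^{n} u_{R_j|\mathrm{pa}(R_j)}[\mathbf{a}_j' \mid \mathrm{cp}(R_j)], \qquad (14)$$

where $\mathbf{a}_i$ is the commitment by $S_i$, $i = 1, \ldots, n$, and $\mathbf{a}_j'$ is the commitment by $R_j$, $j = 1, \ldots, n$.
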
Proof: Mathematically (albeit with different semantics), we may view the $V_i$, $i = 1, \ldots, 2n$, as random variables defined over their respective sample spaces. The Dutch Book Theorem and its converse establish that the necessary and sufficient condition to ensure that sure loss (categorical subjugation) cannot occur is that $u_{V_i|\mathrm{pa}(V_i)}$ must correspond to the conditional probability mass functions of $V_i$ given $\mathrm{cp}(V_i)$. Thus, the categorical utilities of the root vertices must possess the mathematical structure of marginal probability mass functions, and the conditional utilities of non-root vertices must possess the mathematical structure of conditional probability mass functions. Consequently, the vertices and edges of the DAG satisfy all of the conditions of a Bayesian network, and we may apply the fundamental theorem of Bayesian networks; namely, that the joint probability mass function of the random variables associated with the vertices is the product of the conditional probability mass functions of all non-root vertices and the marginal mass functions of all root vertices [9, 19, 34]. Equation (13) is simply an application of the law of compound probability. Thus, coherence is established. □

We will term utilities that comply with Theorem 1 praxeic utilities. It should be noted that this formulation requires all utilities to be non-negative and sum to unity. This restriction, however, does not reduce the generality of the theorem, since utilities can be subjected to positive affine transformations without affecting the solution.
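
To make the product rule of Equation (13) concrete, consider the smallest non-trivial case: a root vertex $V_1$ with a categorical utility and a single child $V_2$ with a conditional utility. The Python sketch below is illustrative only; the two stand-in profiles and all numerical values are hypothetical, and exact fractions are used so that the normalization can be checked without rounding.

```python
from fractions import Fraction as Fr
from itertools import product

profiles = ["a", "b"]   # stand-ins for two action profiles

# Categorical utility of the root vertex V1 (a mass function over profiles).
u_v1 = {"a": Fr(7, 10), "b": Fr(3, 10)}

# Conditional utility u_{V2|V1}(a2 | a1): one mass function per commitment of V1.
u_v2_given_v1 = {
    "a": {"a": Fr(9, 10), "b": Fr(1, 10)},
    "b": {"a": Fr(2, 5), "b": Fr(3, 5)},
}


def joint_utility(a1, a2):
    """Product rule of Eq. (13) for the two-vertex DAG V1 -> V2."""
    return u_v1[a1] * u_v2_given_v1[a1][a2]


joint = {(a1, a2): joint_utility(a1, a2)
         for a1, a2 in product(profiles, repeat=2)}

# Marginal utility of V2, obtained by summing out the commitment of V1.
u_v2 = {a2: sum(joint[(a1, a2)] for a1 in profiles) for a2 in profiles}
```

Since every factor is a mass function, the joint utility is itself a mass function, which is what permits the marginals to be computed by summation.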

Equation (14) expresses the values of the selecting and rejecting facets simultaneously. Since parents of selecting facets may comprise both selecting and rejecting facets, and similarly for the parents of rejecting facets, this function contains all of the possibilities for compromise and conflict. To be useful for decision making, however, it is necessary to compute the joint selectability for all joint commitments by the selecting facets, and the joint rejectability for all joint commitments by the rejecting facets. Since $u_{SR}$ is a multivariate mass function, we may compute the joint selectability and rejectability marginals as

$$u_S(\mathbf{a}_1, \ldots, \mathbf{a}_n) = \sum_{\mathbf{a}_1', \ldots, \mathbf{a}_n'} u_{SR}(\mathbf{a}_1, \ldots, \mathbf{a}_n, \mathbf{a}_1', \ldots, \mathbf{a}_n') \qquad (15)$$

$$u_R(\mathbf{a}_1', \ldots, \mathbf{a}_n') = \sum_{\mathbf{a}_1, \ldots, \mathbf{a}_n} u_{SR}(\mathbf{a}_1, \ldots, \mathbf{a}_n, \mathbf{a}_1', \ldots, \mathbf{a}_n'). \qquad (16)$$


Once the joint selectability and rejectability marginals have been computed, we are in a position to define a satisficing social welfare function. We first observe that, since only one consequence profile can be implemented, to make a decision, we must ascribe the same commitment to each facet, yielding the joint praxeic selectability and joint praxeic rejectability

$$u_S(\mathbf{a}) = u_S(\mathbf{a}, \ldots, \mathbf{a}) \qquad (17)$$

and

$$u_R(\mathbf{a}) = u_R(\mathbf{a}, \ldots, \mathbf{a}). \qquad (18)$$

We next define the satisficing social welfare function

$$W(\mathbf{a}) = u_S(\mathbf{a}) - q\,u_R(\mathbf{a}) \qquad (19)$$

and the jointly satisficing set

$$\Sigma_q = \{\mathbf{a} : W(\mathbf{a}) \ge 0\}. \qquad (20)$$

The parameter $q$ is a measure of caution. Nominally, $q = 1$, but as $q$ decreases, the number of consequence profiles that are rejected decreases. As will subsequently become apparent, another interpretation of $q$ is as an index of negotiation, since lowering $q$ enlarges the satisficing set, thereby increasing the opportunities for reaching a compromise. We will define all consequence profiles such that the satisficing social welfare is non-negative as being satisficing.

We may also compute the individual selectability and rejectability marginal utilities as

$$u_{S_i}(\mathbf{a}_i) = \sum_{\neg \mathbf{a}_i} u_S(\mathbf{a}_1, \ldots, \mathbf{a}_n) \qquad (21)$$

and

$$u_{R_i}(\mathbf{a}_i) = \sum_{\neg \mathbf{a}_i} u_R(\mathbf{a}_1, \ldots, \mathbf{a}_n), \qquad (22)$$

where we have employed the so-called not-sum notation; namely, $\neg \mathbf{a}_i$ means that the sum is taken over all $\mathbf{a}_j$ for $j \ne i$.

The individual welfare function is

$$W_i(\mathbf{a}) = u_{S_i}(\mathbf{a}) - q_i\,u_{R_i}(\mathbf{a}) \qquad (23)$$

and the individually satisficing set is

$$\Gamma_{q_i}^i = \{\mathbf{a} : W_i(\mathbf{a}) \ge 0\}. \qquad (24)$$

The compromise set is the set of all joint actions that are simultaneously satisficing for the group and for the individuals; that is,

$$C = \Sigma_q \cap \Gamma_{q_1}^1 \cap \cdots \cap \Gamma_{q_n}^n. \qquad (25)$$

A satisficing set (either for the group or individuals) constitutes the set of consequences for which effectiveness, as measured by the selectability utility, is at least as great as $q$ (or $q_i$) times the inefficiency, as measured by the rejectability inutility. Rather than focusing on seeking the best and only the best solution, the satisficing methodology focuses on eliminating bad solutions. Since the satisficing set eliminates all alternatives whose effectiveness falls short of this threshold, it is optimal in the sense that it eliminates the maximum number of bad choices. If, in the extreme case, all but one choice are eliminated, then the satisficing solution coincides with the optimal solution. Thus, far from being a boundedly rational solution, the set of satisficing solutions possesses a well-defined notion of optimality (albeit a different one). Thus, we come to Euler's conclusion through the “back door.”
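
The chain from the joint utility to the jointly satisficing set can be traced on a small two-agent example. The Python sketch below is a hedged illustration, not the authors' implementation: it assumes a DAG in which $S_2$ is conditioned on $S_1$ while $R_1$ and $R_2$ are roots, and every numerical utility in it, including the rule used for $S_2$, is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as Fr
from itertools import product

profiles = list(product("CD", repeat=2))   # joint actions (a_1, a_2)

# Hypothetical praxeic utilities; each sums to one over the profiles.
u_s1 = dict(zip(profiles, [Fr(3, 10), Fr(1, 10), Fr(4, 10), Fr(2, 10)]))
u_r1 = {p: Fr(1, 4) for p in profiles}
u_r2 = dict(zip(profiles, [Fr(1, 10), Fr(2, 5), Fr(2, 5), Fr(1, 10)]))


def u_s2_given_s1(p2, p1):
    """Illustrative rule: S2 puts half its mass on whatever S1 commits to."""
    return Fr(1, 2) if p2 == p1 else Fr(1, 6)


# Product rule for the assumed DAG S1 -> S2, with R1 and R2 as roots.
u_sr = {
    ((s1, s2), (r1, r2)):
        u_s1[s1] * u_s2_given_s1(s2, s1) * u_r1[r1] * u_r2[r2]
    for s1, s2, r1, r2 in product(profiles, repeat=4)
}

# Joint selectability and rejectability marginals.
u_s, u_r = defaultdict(Fr), defaultdict(Fr)
for (s, r), value in u_sr.items():
    u_s[s] += value   # sum over the rejecting commitments
    u_r[r] += value   # sum over the selecting commitments


def welfare(a, q=Fr(1)):
    """W(a) = u_S(a, a) - q * u_R(a, a): every facet commits to profile a."""
    return u_s[(a, a)] - q * u_r[(a, a)]


def jointly_satisficing(q=Fr(1)):
    """All profiles whose satisficing social welfare is non-negative."""
    return {a for a in profiles if welfare(a, q) >= 0}
```

Under these assumed numbers, `jointly_satisficing()` excludes one profile at the nominal $q = 1$, and lowering $q$ to $1/2$ admits it, which illustrates the role of $q$ as an index of negotiation.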

Example 1 The Social Prisoner's Dilemma. The conventional Prisoner's Dilemma game is designed to characterize behavior between two decision makers in an environment where cooperation leads to better results than does defection but, if only one attempts to cooperate, that individual becomes vulnerable to being exploited by the other. Classically, this game is defined in terms of categorical utilities. Let $C$ and $D$ denote cooperation and defection, respectively. The corresponding categorical utilities are the entries of the payoff matrix displayed in Table 1. The joint option $(C, C)$ (next best for both) is the Pareto optimal solution, while $(D, D)$ (next worst for both) is the Nash equilibrium solution. Notice that the game is symmetrical. The classical assumption for this game is that there is no social relationship between the players, and that each is intent on, and only on, maximizing its own welfare, regardless of the effect doing so has on the other.

Table 1 The payoff matrix for the conventional Prisoner’s Dilemma
game.

                 X2
X1         C          D
C        (3, 3)     (1, 4)
D        (4, 1)     (2, 2)

Now let us add some social context to this problem. Suppose a
leader-follower relationship exists between the players, with $X_1$
being the leader and $X_2$ the follower. We shall assume that $X_1$
follows the conventional structure of maximizing payoff, but $X_2$ is
interested in (a) following the lead of $X_1$, (b) resisting
exploitation, and (c) not offending $X_1$ by taking advantage of the
possible propensity for $X_1$ to cooperate. We shall take the
definition of selectability to be the same as in the conventional
formulation; namely, to seek to maximize payoff. For rejectability,
however, we invoke a component that is not present in the conventional
formulation: to account for social issues, we assume that the players
have a unit of social resource they may commit to each outcome. Since
the leader has no social commitments, we take its rejectability to be
the same for each outcome. Accordingly, the categorical selectability
and rejectability values for the leader are provided in Table 2.

To account for the social context, we take the utilities for $X_2$ to
be conditional, and assume that both the selectability and the
rejectability of $X_2$ are influenced by the selectability of $X_1$,
as indicated in Figure 3.


Table 2 The categorical selectability and rejectability for the
Prisoner’s Dilemma leader.

(x1, x2)              (C, C)    (C, D)    (D, C)    (D, D)
$u_{S_1}(x_1, x_2)$    0.3       0.1       0.4       0.2
$u_{R_1}(x_1, x_2)$    0.25      0.25      0.25      0.25


Fig.  3  The  influence  network  for  the  social  Prisoner’s  Dilemma  game 


Table 3 displays the conditional selectability for $X_2$ given the
commitments of $X_1$. If $X_1$ were to commit to (C, C), (D, C), or
(D, D), then $X_2$ would do likewise in the interest of maximizing its
payoff. But if $X_1$ were to commit to (C, D), then $X_2$ would resist
being exploited by placing zero conditional utility on (C, D) and
apportioning its conditional utility equally among the other outcomes.

Table 3 X2’s conditional selectability for the social Prisoner’s
Dilemma game.

(x1, x2)                               (C, C)   (C, D)   (D, C)   (D, D)
$u_{S_2|S_1}(x_1, x_2 \mid C, C)$        1        0        0        0
$u_{S_2|S_1}(x_1, x_2 \mid C, D)$       1/3       0       1/3      1/3
$u_{S_2|S_1}(x_1, x_2 \mid D, C)$        0        0        1        0
$u_{S_2|S_1}(x_1, x_2 \mid D, D)$        0        0        0        1

Table 4 displays the conditional rejectability for $X_2$ given the
commitments of $X_1$. If $X_1$ were to commit to (C, C), then $X_2$
would place zero conditional rejectability on that outcome and
apportion all of its conditional rejectability equally among the other
outcomes. If $X_1$ were to commit to (C, D), $X_2$ would place all of
its rejectability on that outcome to ensure it will not be exploited.
If $X_1$ were to commit to (D, C), then $X_2$ would not reject that
outcome so as to not exploit $X_1$, and instead would reject
exploitation by placing its conditional rejectability on (C, D).
Finally, if $X_1$ were to commit to (D, D), $X_2$ would not reject
that outcome, but would instead reject (C, D) as before.

The utility of a joint commitment is given by

$u_{S_1 S_2 R_1 R_2}[(x_1, x_2), (x_1', x_2'), (y_1, y_2), (y_1', y_2')] =
  u_{S_1}(x_1, x_2)\, u_{S_2|S_1}(x_1', x_2' \mid x_1, x_2)\,
  u_{R_1}(y_1, y_2)\, u_{R_2|S_1}(y_1', y_2' \mid x_1, x_2).$  (26)

Table 4 X2’s conditional rejectability for the social Prisoner’s
Dilemma game.

(x1, x2)                               (C, C)   (C, D)   (D, C)   (D, D)
$u_{R_2|S_1}(x_1, x_2 \mid C, C)$        0       1/3      1/3      1/3
$u_{R_2|S_1}(x_1, x_2 \mid C, D)$        0        1        0        0
$u_{R_2|S_1}(x_1, x_2 \mid D, C)$        0        1        0        0
$u_{R_2|S_1}(x_1, x_2 \mid D, D)$        0        1        0        0

The joint praxeic selectability and rejectability functions, as
defined by (17) and (18), are given in Table 5, and the jointly
satisficing set (with q = 1) is {(C, C), (D, D)}. The individual
selectability and rejectability marginal utilities, as defined by (21)
and (22), are displayed in Table 6, from which it can be seen that the
individually satisficing sets (with $q_1 = q_2 = 1$) are
$\Sigma^1 = \{(C, C), (D, C)\}$ and $\Sigma^2 = \{(C, C), (D, D)\}$.
Consequently, the compromise set is $C = \{(C, C)\}$.
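
The set computations in this example can be checked mechanically. The following Python sketch is an illustration only: it takes the praxeic and marginal utilities reported in Tables 5 and 6 as given (it does not recompute them from Tables 2 to 4) and applies the satisficing test, selectability at least q times rejectability, with q = q1 = q2 = 1. The helper names `satisficing_set` and `table` are hypothetical.

```python
from fractions import Fraction as F

OUTCOMES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]

def satisficing_set(u_s, u_r, q=F(1)):
    """Outcomes whose selectability is at least q times their rejectability."""
    return {a for a in OUTCOMES if u_s[a] >= q * u_r[a]}

def table(values):
    """Map the four outcomes, in the order of OUTCOMES, to exact fractions."""
    return {a: F(v) for a, v in zip(OUTCOMES, values)}

# Values transcribed from Tables 5 and 6 (assumed correct, not recomputed).
joint_s = table(["0.3", "0.0", "0.0", "0.2"])
joint_r = table(["0.0", "0.2", "0.025", "0.025"])
s1, r1 = table(["0.3", "0.1", "0.4", "0.2"]), table(["0.25"] * 4)
s2, r2 = table(["0.3", "0.0", "0.0", "0.7"]), table(["0.25"] * 4)

sigma_q = satisficing_set(joint_s, joint_r)   # jointly satisficing set
sigma_1 = satisficing_set(s1, r1)             # individually satisficing, X1
sigma_2 = satisficing_set(s2, r2)             # individually satisficing, X2
compromise = sigma_q & sigma_1 & sigma_2      # intersection as in (25)
print(sorted(compromise))
```

Exact fractions are used so that borderline comparisons are not affected by floating-point rounding.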

Under the classical formulation of the Prisoner’s Dilemma, the only
rationally justifiable solution is mutual defection, since that
formulation does not take into consideration any social relationships.
From the classical point of view, mutual cooperation, although Pareto
optimal, cannot be justified. The social version of the game as
developed here, however, indicates that mutual cooperation is the only
justified solution.


Table 5 The joint praxeic selectability and rejectability for the
social Prisoner’s Dilemma game.

[(x1, x2), (x1, x2)]     $u_{S_1 S_2}$    $u_{R_1 R_2}$
[(C, C), (C, C)]            0.3              0.0
[(C, D), (C, D)]            0.0              0.2
[(D, C), (D, C)]            0.0              0.025
[(D, D), (D, D)]            0.2              0.025

Table 6 The individual selectability and rejectability utilities for
the social Prisoner’s Dilemma game.

(x1, x2)               (C, C)    (C, D)    (D, C)    (D, D)
$u_{S_1}(x_1, x_2)$     0.3       0.1       0.4       0.2
$u_{R_1}(x_1, x_2)$     0.25      0.25      0.25      0.25
$u_{S_2}(x_1, x_2)$     0.3       0.0       0.0       0.7
$u_{R_2}(x_1, x_2)$     0.25      0.25      0.25      0.25

5  Reconciliation  with  Classical  Theory 

Not all problems fit naturally into the dual-utility structure of
satisficing theory. One way to deal with this situation, while still
retaining some of the flavor of satisficing theory, is to invoke the
assumption that all consequences are rejectability neutral, and to
ascribe all meaningful utility to selectability. In this situation, we
set the rejectability to a constant:
$u_{R_j|\mathrm{pa}(R_j)}[a_j \mid \mathrm{cp}(R_j)] = \kappa_j = 1/|A_j|$
($|\cdot|$ denotes cardinality) for all $a_j$. We define the
conditional utility of $X_i$ as

$u_{X_i|\mathrm{pa}(X_i)}[a_i \mid \mathrm{cp}(X_i)] = u_{S_i|\mathrm{pa}(S_i)}[a_i \mid \mathrm{cp}(S_i)].$  (27)

Thus, (14) becomes a function of $a_i$, $i = 1, \ldots, n$, only, and
we may write

$u_X(a_1, \ldots, a_n) = \prod_{i=1}^{n} u_{X_i|\mathrm{pa}(X_i)}[a_i \mid \mathrm{cp}(X_i)],$  (28)

and the marginals become

$u_{X_i}(a_i) = \sum_{\neg a_i} u_X(a_1, \ldots, a_n).$  (29)
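
As a small illustration of the product form (28) and the marginalization (29), the following Python sketch builds a joint utility for two agents and sums out the first agent's action. All numbers are hypothetical, and the sketch assumes, for simplicity, that X2 conditions only on X1's own action.

```python
from fractions import Fraction as F
from itertools import product

A1 = ["C", "D"]  # action set of X1 (hypothetical)
A2 = ["C", "D"]  # action set of X2 (hypothetical)

# Hypothetical categorical utility of X1, which has no parents.
u_x1 = {"C": F(2, 5), "D": F(3, 5)}

# Hypothetical conditional utility of X2 given X1's action: u_x2_given[a1][a2].
u_x2_given = {
    "C": {"C": F(3, 4), "D": F(1, 4)},
    "D": {"C": F(1, 3), "D": F(2, 3)},
}

# Product of conditional utilities, in the manner of (28).
u_x = {(a1, a2): u_x1[a1] * u_x2_given[a1][a2] for a1, a2 in product(A1, A2)}

# Marginal of X2, in the manner of (29): sum over every component except a2.
u_x2 = {a2: sum(u_x[(a1, a2)] for a1 in A1) for a2 in A2}

print(u_x2)
```

Because each factor is a mass function, the joint utility and each marginal also sum to one.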

Once all of the valuations are concentrated in a single utility, we
view the decision problem from the classical perspective of
optimization. The most well-known solution concept for individuals is
the non-cooperative game-theoretic concept of Nash equilibria [31].
Let $a^* = (a_1^*, \ldots, a_n^*)$. The action profile $a^*$ is a Nash
equilibrium if, were any single individual to alter its choice, its
utility would decrease; i.e., if
$a^\dagger = (a_1^*, \ldots, a_i', \ldots, a_n^*)$, then, in terms of
categorical utilities,

$u_{X_i}(a^*) > u_{X_i}(a^\dagger)$  (30)

for all $a_i' \in A_i \setminus \{a_i^*\}$ for $i = 1, \ldots, n$.
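
The definition can be checked by brute force on a small game. The sketch below, offered as an illustration, enumerates the pure action profiles of the conventional Prisoner's Dilemma using the payoffs of Table 1 and keeps those for which every unilateral deviation strictly lowers the deviator's payoff. The function name `is_nash` is hypothetical.

```python
from itertools import product

ACTIONS = ["C", "D"]

# Payoffs (to X1, to X2) from Table 1, indexed by the joint action (x1, x2).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (1, 4),
    ("D", "C"): (4, 1),
    ("D", "D"): (2, 2),
}

def is_nash(profile):
    """True if every unilateral deviation strictly lowers the deviator's payoff."""
    for i in range(2):
        for alt in ACTIONS:
            if alt == profile[i]:
                continue
            deviated = list(profile)
            deviated[i] = alt
            if PAYOFF[tuple(deviated)][i] >= PAYOFF[profile][i]:
                return False
    return True

equilibria = [p for p in product(ACTIONS, repeat=2) if is_nash(p)]
print(equilibria)
```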

When conditional utilities are involved, we may define two notions of
equilibrium. First, let us define what might be called a conditional
Nash equilibrium. The action profile $a^*$ is a conditional Nash
equilibrium if

$u_{X_i|\mathrm{pa}(X_i)}(a^* \mid a^*, \ldots, a^*) > u_{X_i|\mathrm{pa}(X_i)}(a^\dagger \mid a^\dagger, \ldots, a^\dagger)$  (31)

for all $a_i' \in A_i \setminus \{a_i^*\}$ for $i = 1, \ldots, n$.

We may also compute the Nash equilibrium in terms of the marginal
utility defined by (29). The action profile $a^*$ is a Nash
equilibrium if

$u_{X_i}(a^*) > u_{X_i}(a^\dagger).$  (32)

Example 2 Prisoner’s Dilemma, Continued. Let us revisit the Prisoner’s
Dilemma discussed in Example 1 under the assumption of neutral
rejectability, and set $u_{X_1} = u_{S_1}$ and
$u_{X_2|\mathrm{pa}(X_2)} = u_{S_2|\mathrm{pa}(S_2)}$ as defined in
Tables 2 and 3, respectively. By inspection, we see that the
conditional Nash equilibrium is (D, D), as with the conventional
formulation. Furthermore, the marginal utilities are given in Table 7,
and we see that, again, the Nash equilibrium is (D, D).

Table 7 The marginal utilities for the Prisoner’s Dilemma.

              (C, C)    (C, D)    (D, C)    (D, D)
$u_{X_1}$      3/10      1/10      4/10      2/10
$u_{X_2}$      3/10       0         0        7/10

The Nash equilibrium is usually considered to be an appropriate
solution concept for non-cooperative games. On the other hand, with a
cooperative game (i.e., one where binding agreements are possible), it
may be possible to enter into negotiations and bargain for a solution.
For the players to forge an agreement, however, each must achieve an
acceptable degree of satisfaction. A bargaining game is a cooperative
game in which each participant possesses a disagreement point that
defines the benefit that is guaranteed to accrue to it if a compromise
cannot be reached. The disagreement point, therefore, is an indication
of the strategic strength that is conferred on the participant as it
participates in negotiations: the higher the disagreement point, the
greater the bargaining strength of the participant.

A well-known bargaining concept that offers a clear definition of
individual acceptability is the Nash bargain [30], which permits each
participant to make maximal use of its strategic strength. The
approach is based on four fundamental principles: (i) invariance to
positive affine transformations; (ii) Pareto optimality; (iii)
independence of irrelevant alternatives; and (iv) symmetry, which is
the notion that no individual agent can expect that the other agents
will grant it better terms than that individual itself would be
willing to grant, were roles reversed.

Nash showed that these four conditions lead to a unique solution. Let
$d_{X_i}$ denote the disagreement point for $X_i$. The negotiation
set, denoted $\mathcal{N}$, is the subset of action profiles such that
every participant achieves at least its disagreement point. In terms
of categorical utilities, the negotiation set is

$\mathcal{N} = \{a \in A : u_{X_i}(a) \ge d_{X_i},\ i = 1, \ldots, n\}$  (33)

and the Nash bargain is

$a_N = \arg\max_{a \in \mathcal{N}} \prod_{i=1}^{n} [u_{X_i}(a) - d_{X_i}].$  (34)

The intuitive interpretation of a Nash bargain is that it defines a
fair compromise. It enables each player to take advantage of the
strategic strength endowed by its disagreement point. The higher
$X_i$’s disagreement point, the more action profiles that are
unfavorable to it are eliminated.
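
A minimal sketch of (33) and (34) on the conventional Prisoner's Dilemma of Table 1 follows. The disagreement point is an assumption made here for illustration: each player is taken to fall back on the mutual-defection payoff if no agreement is reached. The names `negotiation_set` and `nash_product` are hypothetical.

```python
from math import prod

# Payoffs (to X1, to X2) from Table 1, indexed by the joint action (x1, x2).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (1, 4),
    ("D", "C"): (4, 1),
    ("D", "D"): (2, 2),
}

# Assumed disagreement point: the mutual-defection payoffs.
d = PAYOFF[("D", "D")]

# Negotiation set (33): every player achieves at least its disagreement point.
negotiation_set = [
    a for a, u in PAYOFF.items() if all(u[i] >= d[i] for i in range(2))
]

def nash_product(a):
    """Product of gains over the disagreement point, the objective in (34)."""
    return prod(PAYOFF[a][i] - d[i] for i in range(2))

bargain = max(negotiation_set, key=nash_product)
print(negotiation_set, bargain)
```

Raising either player's disagreement point shrinks the negotiation set, which is the sense in which a higher disagreement point confers bargaining strength.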

The structure of (34) suggests that the optimal group solution can be
interpreted as a Nash bargain with unilateral utilities replaced by
conditional utilities and all disagreement points set to zero.
Analogously, therefore, we may define a conditional Nash bargaining
solution. When decisions are made under certainty, the negotiation set
is defined as

$\mathcal{N} = \{a \in A : u_{X_i|\mathrm{pa}(X_i)}[a \mid \mathrm{cp}(X_i)] \ge d_{X_i},\ i = 1, \ldots, n\}.$  (35)

The conditional Nash bargaining solution is

$a_N = \arg\max_{a \in \mathcal{N}} \prod_{i=1}^{n} \big[ u_{X_i|\mathrm{pa}(X_i)}[a \mid \mathrm{cp}(X_i)] - d_{X_i} \big].$  (36)

Referring again to the Prisoner’s Dilemma example, it is easily seen
that the conditional Nash bargain is the same as the conventional Nash
bargain for the Prisoner’s Dilemma; namely, the Pareto optimal
solution (C, C).

6  Conditionally  Decoupled  Societies 

6.1  The  General  Case 

The approach developed above assumes that the conditional preferences
are defined over the entire product action space. In this respect
conditional preferences are generalizations of classical categorical
preferences, the difference being that the preferences can be
modulated by the commitments of others. Although increased complexity
is associated with the introduction of conditional preferences, there
are cases where this additional complexity is not justified. It can be
the case that the only commitments that affect the preferences of an
agent are the direct consequences to its parents. This situation
motivates the notion of conditional decoupling.

Definition 5 A society is conditionally decoupled if the conditional
preference of each agent is a function only of its own actions, given
the commitments of its parents to their own actions.

Whereas, for a non-decoupled system, the utilities are functions of
the entire action profile, for a decoupled system, the utilities are
functions of individual actions. To develop this concept, suppose
$\mathrm{cp}(V_i) = \{a_{i_1}, \ldots, a_{i_{L_{V_i}}}\}$. Then the
conditional utility

$u_{V_i|\mathrm{pa}(V_i)}[a \mid \mathrm{cp}(V_i)] = u_{V_i|\mathrm{pa}(V_i)}(a \mid a_{i_1}, \ldots, a_{i_{L_{V_i}}})$  (37)

becomes

$u_{V_i|\mathrm{pa}(V_i)}[a_i \mid \mathrm{cp}(V_i)] = u_{V_i|\mathrm{pa}(V_i)}(a_i \mid a_{i_1}, \ldots, a_{i_{L_{V_i}}}).$  (38)

Then (14) becomes

$u_{SR}(a_1, \ldots, a_n, a_1', \ldots, a_n') =
  \prod_{i=1}^{n} u_{S_i|\mathrm{pa}(S_i)}[a_i \mid \mathrm{cp}(S_i)]
  \prod_{j=1}^{n} u_{R_j|\mathrm{pa}(R_j)}[a_j' \mid \mathrm{cp}(R_j)].$  (39)

The corresponding joint selectability and rejectability marginals are
given by

$u_S(a_1, \ldots, a_n) = \sum_{a_1', \ldots, a_n'} u_{SR}(a_1, \ldots, a_n, a_1', \ldots, a_n')$  (40)

and

$u_R(a_1', \ldots, a_n') = \sum_{a_1, \ldots, a_n} u_{SR}(a_1, \ldots, a_n, a_1', \ldots, a_n').$  (41)

We may now define a social welfare function as

$W(a_1, \ldots, a_n) = u_S(a_1, \ldots, a_n) - q_G u_R(a_1, \ldots, a_n)$  (42)

where $q_G$ is the joint q-value for the group. The jointly
satisficing set is the set of action profiles that are jointly
satisficing for the society as a whole, and is defined as

$\Sigma = \{(a_1, \ldots, a_n) \in A : W(a_1, \ldots, a_n) \ge 0\}.$  (43)

This procedure, however, does not account for the possibility that the
elements of $\Sigma$ may not be acceptable to all (or any) of the
individuals. Thus, we must also compute the individually satisficing
sets. To proceed, we must first compute the selectability and
rejectability marginals as

$u_{S_i}(a_i) = \sum_{\neg a_i} u_S(a_1, \ldots, a_n)$  (44)

and

$u_{R_i}(a_i) = \sum_{\neg a_i} u_R(a_1, \ldots, a_n),$  (45)

respectively. We may then define the individually satisficing sets as

$\Sigma_i = \{a_i \in A_i : u_{S_i}(a_i) \ge q_i u_{R_i}(a_i)\}.$  (46)

This set includes all alternatives that are satisficing, or good
enough, for $X_i$. The satisficing rectangle is the set of all action
profiles such that each component is individually satisficing, and is
given by

$\mathcal{R} = \Sigma_1 \times \cdots \times \Sigma_n.$  (47)

The intersection of the jointly satisficing set and the satisficing
rectangle yields the compromise set, comprising the action profiles
that are simultaneously good enough for the group and for each
individual:

$C = \Sigma \cap \mathcal{R}.$  (48)

If $C \ne \emptyset$, then we may form a best compromise as

$a^* = \arg\max_{a \in C} W(a).$  (49)

If $C = \emptyset$, then there are no action profiles that are
simultaneously good enough for the group and each individual. However,
the satisficing approach provides a natural and systematic negotiation
framework within which each individual may control the degree to which
it is willing to lower its standards in an attempt to reach a
compromise. By lowering its $q_i$-value incrementally, each $X_i$
increases the size of its satisficing set. By specifying the increment
$\Delta q_i$ by which $X_i$ is willing to reduce its standards, each
participant can control the amount of compromise it is willing to
offer others. If enough participants are willing to lower their
q-values sufficiently, it is easy to see that, eventually, the
consensus set will be non-empty, and a best compromise can be
achieved. Although such negotiations may fail to reach a compromise
that is acceptable to all members, the significant aspect of this type
of negotiation is that no individual is a priori subjugated to the
will of the society in the sense that there is no possibility for that
individual’s preferences to receive consideration. Thus, every
individual can be assured of receiving sufficient benefit, by its own
definition, before agreeing to the compromise. If an individual could
not enjoy at least that minimal assurance, it might not be inclined to
join or remain affiliated with a society.

6.2 Social Choice

With the general multi-agent decision problem, each individual
possesses its own action set. Some scenarios, however, are such that
there is only one action set that applies to the group as a whole.
Scenarios of this type are termed social choice problems. Thus, with a
social choice problem, there is only one action space A. The social
welfare function (42) becomes

$W(a) = u_S(a, \ldots, a) - q_G u_R(a, \ldots, a)$  (50)

and the jointly satisficing set becomes

$\Sigma = \{a \in A : W(a) \ge 0\}.$  (51)

The individual selectability and rejectability marginals are computed
according to (44) and (45), and the individually satisficing sets
$\Sigma_i$, $i = 1, \ldots, n$, are given by (46). The intersection of
the individually satisficing sets forms the social compromise set

$C_S = \Sigma_1 \cap \cdots \cap \Sigma_n.$  (52)

If $G_S = \emptyset$, then there is no group action that is good
enough for the group and each individual. However, by reducing the
q-values incrementally as discussed above, a consensus will eventually
emerge. The social consensus set is the intersection of the social
compromise set and the jointly satisficing set:

$G_S = \Sigma \cap C_S.$  (53)

The best compromise is the action in this set that maximizes the
social welfare function; that is,

$a^* = \arg\max_{a \in G_S} W(a).$  (54)

Example 3 The Family Walk. Suppose a family, consisting of a father,
mother, and child, is to take one of three possible nature walks,
denoted {w, w', w"}. The father prefers long hikes, the mother prefers
beautiful scenery, and the child prefers an easy walk.

The first order of business in framing this in the satisficing context
is to settle on operational definitions for the notions of
selectability and rejectability. From the point of view of each
individual, the main goal of the walk is enjoyment according to its
own criterion. Thus, it is reasonable to associate selectability with
the degree of narrow self-interest. Accordingly, we define the three
selectability utilities in Table 8.

Table 8 Individual selectability utilities.

x      $u_{S_1}(x)$   $u_{S_2}(x)$   $u_{S_3}(x)$
w         0.1            0.4            0.3
w'        0.3            0.4            0.6
w"        0.6            0.2            0.1

As the operational definition of rejectability, we assume that each
agent has a unit of concern for the interests of others. Let us first
consider the mother. Since she has concern for the interests of her
child, she will encode this information in a rejectability function
that is conditioned on the selectability commitment of her child, as
illustrated in Table 9. To interpret this table, consider the first
column, which corresponds to $u_{R_2|S_1}(\cdot \mid w)$; that is, the
child commits to selecting w. Since this walk is tied for the most
preferred by the mother, she ascribes no conditional rejectability to
that alternative, and places all of her conditional rejectability mass
on w' and w" in inverse proportion to her selectability. Similar
arguments apply if the child commits to w' or w".

Table 9 Mother’s conditional rejectability $u_{R_2|S_1}$.

x      $u_{R_2|S_1}(x \mid w)$   $u_{R_2|S_1}(x \mid w')$   $u_{R_2|S_1}(x \mid w'')$
w            0.0                       0.4                        0.5
w'           0.4                       0.0                        0.5
w"           0.6                       0.6                        0.0

The father’s role in this decision process is to defer first to the
commitments of his child, then to the commitments of his wife, and
then, subject to those constraints, to reject the alternative that is
least preferred in terms of his narrow self-interest. These values are
provided in Table 10.


Table 10 Father’s conditional rejectability $u_{R_3|S_1 S_2}$.

x                                      w     w'    w"
$u_{R_3|S_1 S_2}(x \mid w, w)$         0     0     1
$u_{R_3|S_1 S_2}(x \mid w, w')$        0     0     1
$u_{R_3|S_1 S_2}(x \mid w, w'')$       0     1     0
$u_{R_3|S_1 S_2}(x \mid w', w)$        0     0     1
$u_{R_3|S_1 S_2}(x \mid w', w')$       0     0     1
$u_{R_3|S_1 S_2}(x \mid w', w'')$      1     0     0
$u_{R_3|S_1 S_2}(x \mid w'', w)$       0     1     0
$u_{R_3|S_1 S_2}(x \mid w'', w')$      1     0     0
$u_{R_3|S_1 S_2}(x \mid w'', w'')$     1     0     0

Finally, we must specify the child’s rejectability. This rejectability
is not conditioned, since the model does not call for the child’s
preferences to be influenced by the parents’ preferences. Thus, the
child’s concern for the interests of others is neutral; that is, the
child’s rejectability function is uniform, as provided in (55).

$u_{R_1}(w) = u_{R_1}(w') = u_{R_1}(w'') = \frac{1}{3}$  (55)

Figure  4  illustrates  the  influence  flows  of  the  satisficing 
praxeic  network  for  the  family  walk. 


Fig.  4  A  satisficing  praxeic  network  for  the  family  walk 

Using the values provided in the above tables, we may compute the
social welfare function, yielding

W(w) = -0.05
W(w') = 0.36667
W(w") = -0.052;

hence $\Sigma = \{w'\}$.

We next compute the individually satisficing sets, yielding

$u_{S_1}(w) - q_1 u_{R_1}(w) = -0.233$
$u_{S_1}(w') - q_1 u_{R_1}(w') = -0.033$
$u_{S_1}(w'') - q_1 u_{R_1}(w'') = 0.267,$

$u_{S_2}(w) - q_2 u_{R_2}(w) = 0.14$
$u_{S_2}(w') - q_2 u_{R_2}(w') = 0.14$
$u_{S_2}(w'') - q_2 u_{R_2}(w'') = -0.06,$

and

$u_{S_3}(w) - q_3 u_{R_3}(w) = -0.12$
$u_{S_3}(w') - q_3 u_{R_3}(w') = 0.18$
$u_{S_3}(w'') - q_3 u_{R_3}(w'') = -0.32.$

Thus, we have $\Sigma_1 = \{w''\}$, $\Sigma_2 = \{w, w'\}$, and
$\Sigma_3 = \{w'\}$. Consequently,
$C_S = \Sigma_1 \cap \Sigma_2 \cap \Sigma_3 = \emptyset$, and the
society has not reached a compromise that is acceptable to all
participants. However, if the father reduces his q-value to 0.9, then

$u_{S_1}(w) - q_1 u_{R_1}(w) = -0.2$
$u_{S_1}(w') - q_1 u_{R_1}(w') = 0.0$
$u_{S_1}(w'') - q_1 u_{R_1}(w'') = 0.3.$

Hence, $\Sigma_1 = \{w', w''\}$, and a consensus exists with
$G_S = \{w'\}$. An important feature of this example is that the
father need only reduce his standards by a small amount to achieve a
consensus. In terms of the narrow self-interest offering given by
Table 8, we see, after taking into consideration the social
dependencies that exist among the individuals, that the consensus
alternative is best for the mother and the child and second best for
the father.
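
The last step of this example, lowering one q-value until a consensus appears, can be sketched in Python as follows. This is an illustration under stated assumptions: the satisficing set of the agent indexed 1 is recomputed from the $u_{S_1}$ column of Table 8 and the uniform rejectability (55), while the jointly satisficing set and the other two individual sets are taken as reported in the example rather than recomputed. The function `negotiate`, its step size of 1/10, and its stopping rule are hypothetical choices, not a protocol specified in the text.

```python
from fractions import Fraction as F

WALKS = ["w", "w'", "w''"]

def satisficing_set(u_s, u_r, q):
    """Alternatives with selectability at least q times rejectability, as in (46)."""
    return {x for x in WALKS if u_s[x] >= q * u_r[x]}

# Agent 1: selectability column u_S1 of Table 8, uniform rejectability (55).
u_s1 = {"w": F(1, 10), "w'": F(3, 10), "w''": F(6, 10)}
u_r1 = {x: F(1, 3) for x in WALKS}

# Sets reported in the example, taken as given rather than recomputed.
sigma_joint = {"w'"}
sigma_2 = {"w", "w'"}
sigma_3 = {"w'"}

def negotiate(q_start=F(1), step=F(1, 10), q_min=F(0)):
    """Lower q_1 by `step` until the consensus set is non-empty."""
    q = q_start
    while q >= q_min:
        sigma_1 = satisficing_set(u_s1, u_r1, q)
        consensus = sigma_joint & sigma_1 & sigma_2 & sigma_3
        if consensus:
            return q, consensus
        q -= step
    return None, set()

q_final, consensus = negotiate()
print(q_final, consensus)
```

Exact fractions matter here because the alternative w' sits exactly on the satisficing boundary when the q-value is 0.9.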

7  Conclusions 

Multi-stakeholder decision theory, and social robotics in particular,
is in need of a mathematical framework that is designed to accommodate
sophisticated social behaviors such as cooperation, compromise,
negotiation, and altruism. The classical framework developed by the
social sciences and operations research is based on categorical
preference orderings and optimization, and is not sufficiently general
to characterize these social behaviors. This research represents a
significant departure from classical theory by incorporating three
critical notions: conditioning, coherence, and satisficing.

In contrast to categorical utilities, which are designed to
characterize self-interest, conditional utilities provide a means
whereby individuals may extend their spheres of interest beyond the
self. By modulating its preference structure to account for the
preferences of others, an individual may account for sophisticated
social relationships such as conditional altruism, and thereby give
deference to others without categorically redefining its preferences.

In a homogeneous environment where decision makers are required to
compromise and negotiate, it is important to ensure that no agent can
be categorically subjugated. Coherence is a minimal notion of equity
among the participants that can be ensured if and only if the
mathematical syntax of the utilities corresponds to probability mass
functions (albeit with different semantics). For societies whose
inter-agent influence relationships can be represented by a directed
acyclic graph, coherence ensures that the edges are conditional mass
functions, resulting in a structure that is mathematically identical
to a Bayesian network. This structure permits individual utilities to
be aggregated to form a group utility that accounts for social
relationships between individuals, thereby providing a complete model
of the community.

Satisficing, as defined herein, is an approach to decision making that
is as mathematically precise and formalized as is the conventional
notion of optimization. The essential advantage of satisficing is that
it readily extends to the multiagent case, whereas optimization is
intrinsically a single-agent concept. Furthermore, since satisficing
is designed to provide a set of acceptable solutions rather than a
unique best solution, it provides a natural mechanism with which to
design a negotiation protocol and reach a compromise.

Attitude Adaptation in Satisficing Games

Matthew  Nokleby  and  Wynn  Stirling 


Abstract: Satisficing game theory offers an alternative to classical
game theory that describes a flexible model of players’ social
interactions. Players’ utility functions depend on other players’
attitudes rather than simply their actions. However, satisficing
players with conflicting attitudes may enact dysfunctional behaviors,
resulting in poor performance. We present an evolutionary method by
which a population of players may adapt its attitudes to improve
payoff. Additionally, we extend the Nash equilibrium concept to
satisficing games, showing that the method presented leads players
toward an equilibrium in their attitudes. We apply these ideas to the
Stag Hunt, a simple game in which cooperation does not easily evolve
from non-cooperation. The evolutionary method presented provides two
major contributions. First, satisficing players may improve their
performance by adapting their attitudes. Second, numerical results
demonstrate that cooperation in the Stag Hunt can emerge much more
readily under the method presented than under traditional evolutionary
models.


I.  Introduction 

Game-theoretic models are often used to construct societies of
artificial agents. Commonly, agents are modeled as players in a
non-cooperative game in which players focus solely on the maximization
of individual payoff. The players’ self-interest leads to Nash
equilibria [3], which are strategy profiles such that no single player
can improve its payoff by changing strategies. Unfortunately,
self-interested behavior places significant limitations on the
players’ social interactions. For example, it is often difficult to
engender cooperation and other social behaviors with self-interested
players. Indeed, the self-interest hypothesis has come under nearly
continuous criticism since the inception of game theory [4-7].

Satisficing game theory [8] offers an alternative to non-cooperative
game theory. It was developed for the synthesis of artificial agents
and specifically focuses on social interactions between players.
Players’ utilities are expressed as conditional mass functions,
allowing them to consider the preferences of others rather than
focusing solely on individual self-interest. Satisficing models have
previously been successful in overcoming the social hurdles presented
by non-cooperative game theory, allowing players to exhibit
sophisticated social behaviors such as altruism, negotiation, and
compromise [9]. However, satisficing theory presents its own set of
challenges. As in real-life social situations, satisficing communities
may behave dysfunctionally. When players with incompatible attitudes
are grouped together, they can choose incoherent behaviors that lead
to poor performance.

The Stag Hunt, a simple game originally suggested by Rousseau [10],
underscores the difficulty of achieving cooperation under
self-interest. As usually formalized, the game involves two hunters.
They can catch a stag only if they hunt stag together, but each can
catch a (much smaller) hare separately. That is, a player earns
maximum payoff if both players cooperate, but risks failure if it
attempts to cooperate while the other does not. Since each player must
individually decide between cooperation and non-cooperation, the game
represents a useful model for the analysis of potentially cooperative
behavior. For example, a group of workers choosing whether to strike
falls loosely under the Stag Hunt model: a large number of workers may
achieve a significant benefit by striking, while a single worker who
“strikes” alone incurs significant loss.

Social  dilemmas  such  as  the  Stag  Hunt  have  been  studied 
extensively  by  (among  others)  social  scientists,  economists, 
and  biologists.  A  large  body  of  recent  work  focuses  on 
learning-based [11-13] and evolutionary [14-16] methods for
achieving  cooperation.  In  evolutionary  game  theory,  pioneered 
by  Maynard  Smith  [17, 18],  populations  of  players  make 
decisions  by  trial-and-error  rather  than  by  explicit  utility 
maximization.  Over  time,  natural  selection  favors  individuals 
who  earn  higher  payoff,  altering  the  population’s  makeup. 
Large,  well-mixed  populations  are  described  by  the  replicator 
dynamics  [19],  which  defines  a  system  of  ordinary  differential 
equations  governing  the  evolution  of  the  population.  Under 
suitable conditions, the replicator dynamics drives the population
to a Nash equilibrium.

The  Stag  Hunt  presents  considerable  difficulties  from  an 
evolutionary perspective. Under the standard replicator dynamics,
a population composed primarily of hare hunters cannot
evolve  into  a  group  of  stag  hunters,  even  though  each  player 
benefits  from  cooperation.  Skyrms  posits  a  compelling  reason 
for  this  failure:  “for  the  Hare  Hunters  to  decide  to  be  Stag 
Hunters,  each  must  change  her  beliefs  about  what  the  others 
will  do.  But  rational  choice  based  on  game  theory  as  usually 
conceived,  has  nothing  to  say  about  how  or  why  such  a  change 
might  take  place”  [20,  emphasis  in  the  original]. 

Motivated  by  Skyrms’  conjecture,  we  explore  methods  by 
which  “such  a  change”  may  take  place  in  satisficing  game 
theory. To do so, we attempt to bridge the gap between
non-cooperative and satisficing game theory by incorporating
elements of non-cooperative game theory into satisficing theory.
In  a  manner  similar  to  [21,22],  we  present  a  method  whereby 
a  population  of  players  may  modify  its  attitudes  according 
to  the  game  structure  and  the  attitudes  of  other  players.  In 
our  method,  which  employs  the  standard  replicator  dynamics, 
players  whose  attitudes  result  in  higher  payoffs  reproduce 
more readily, causing their attitudes to dominate the population.
The resulting model blends the two decision theories:
players  retain  the  conditional  utility  structure  of  satisficing 
theory  while  improving  payoff  by  evolutionary  means.  The 
dynamics  leads  the  players  toward  a  Nash  equilibrium  in 
players’  attitudes  rather  than  in  their  actions. 

In  Section  II  we  familiarize  the  reader  with  the  basics  of 
satisficing  game  theory.  In  Section  III  we  review  the  classical 
formulation  of  the  Stag  Hunt  and  its  evolutionary  difficulties. 
We  present  a  satisficing  model  for  the  Stag  Hunt  in  Section 
IV.  In  Section  V  we  define  the  attitude  equilibrium  and 
present  the  attitude  dynamics.  We  present  experimental  results 
in  Section  VI  and  compare  the  satisficing  approach  to  other 
recent  methods  in  evolutionary  game  theory.  We  give  our 
conclusions  in  Section  VII. 

II.  Satisficing  Game  Theory 

While  the  simple  and  seemingly  reasonable  assumption  of 
self-interest — also  called  individual  rationality — has  given  rise 
to  a  rich  and  successful  theory  of  games,  narrow  maximization 
may  be  too  simple,  particularly  in  describing  social  situations. 
As  observed  by  Luce  and  Raiffa,  “general  game  theory  seems 
to  be  in  part  a  sociological  theory  which  does  not  include  any 
sociological assumptions ... it may be too much to ask that any
sociology  be  derived  from  the  single  assumption  of  individual 
rationality”  [4,  p.  196].  Satisficing  game  theory  provides  an 
alternative  to  the  classical  framework.  It  presents  a  more 
elaborate  structure  which  may  be  more  useful  in  modeling 
social  behaviors.  Players  may  directly  concern  themselves  with 
the  preferences  of  others,  rather  than  explicitly  attempting  to 
maximize  utility. 

We  construct  the  satisficing  framework  by  altering  the 
structure  of  the  players’  utility  functions.  First,  each  player 
possesses two utilities: one to characterize the benefits associated
with taking an action and one to characterize the costs.
A satisficing player contents itself with any decision for which
the benefits outweigh the costs; such a decision is “good enough,”
or satisficing.1 Second, the players’ utility functions share a common
syntax  with  probability  mass  functions,  allowing  probabilistic 
concepts  such  as  conditioning  and  independence  to  be  applied 
to  players’  preferences — albeit  with  a  significantly  different 
interpretation. 

The  use  of  probability  mass  functions  to  describe  a  player’s 
preferences  rather  than  a  random  phenomenon  is  an  unusual 
one,  and  warrants  further  explanation.  A  rigorous  justification 
is  given  in  [24],  where  it  is  shown  that  the  use  of  mass 
functions as utilities guarantees several useful social properties
regarding the reconciliation of group and individual
preferences.  Fortunately,  however,  the  benefits  of  conditional 
utilities  may  also  be  appreciated  intuitively.  For  two  discrete 
random phenomena $X$ and $Y$, where $Y$ is dependent on $X$,
we can express the probabilities for $Y$ by the conditional
mass function $p_{Y|X}(y|x)$. The conditional mass function gives
hypothetical probabilities of $Y$: what would be the probability
that $Y = y$ if we knew that $X$ took on some value $x$? If
we know the probabilities for $X = x$, we can compute the
marginal mass function according to the basic rules of probability
theory: $p_Y(y) = \sum_x p_{Y|X}(y|x)\, p_X(x)$. The marginal
probabilities for $Y$ are influenced, but not entirely dictated, by the
probabilities of $X$.

1 Although they share similarities, satisficing game theory should not be
confused with the concept of “bounded rationality” satisficing introduced by
Simon [23]. With satisficing à la Simon, individuals search for sub-optimal
choices that meet a variable threshold or aspiration level, implicitly accounting
for the cost of continued searching.
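The marginalization rule above can be sketched in a few lines of Python. This is only an illustration: the outcome labels and all numerical values are hypothetical and do not come from the paper.

```python
# Hypothetical mass functions, chosen only for illustration.
p_X = {"x1": 0.3, "x2": 0.7}
p_Y_given_X = {
    "x1": {"y1": 0.9, "y2": 0.1},  # p_{Y|X}(. | x1)
    "x2": {"y1": 0.2, "y2": 0.8},  # p_{Y|X}(. | x2)
}


def marginal(p_cond, p_parent):
    """Return p_Y(y) = sum_x p_{Y|X}(y|x) * p_X(x) as a dict."""
    p_out = {}
    for x, p_x in p_parent.items():
        for y, p_y_given_x in p_cond[x].items():
            p_out[y] = p_out.get(y, 0.0) + p_y_given_x * p_x
    return p_out


p_Y = marginal(p_Y_given_X, p_X)
```

With these numbers, the marginal for y1 is 0.9 * 0.3 + 0.2 * 0.7 = 0.41, which lies between the two conditional values: the distribution of X influences the result without dictating it.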

Similarly, players’ preferences may depend upon the preferences
of others, allowing their utilities (which we call social
utilities) to be expressed as conditional mass functions. The
conditional mass functions allow for hypothetical expressions
of utility: what would player 1’s utilities be if player 2
unilaterally preferred a particular action? We can compute
player 1’s marginal utilities, which are the utilities used for
decision-making, by summing the conditional utilities over
player 2’s actual preferences. This structure allows players to
consider  not  simply  what  actions  other  players  may  prefer, 
but  how  strong  the  preferences  for  action  are.  Their  utilities 
are  influenced  by  others’  preferences  in  a  controlled  manner 
which  does  not  require  that  they  discard  their  own  preferences. 

A.  Formalization 

First, define the set of players $X = \{1, 2, \ldots, n\}$. Each
player chooses a pure strategy $u_i \in U_i$, where $U_i$ is player
$i$’s pure-strategy set. A pure-strategy profile, which describes
the actions of all of the players, is an $n$-dimensional vector
$u \in U$, where $U = U_1 \times U_2 \times \cdots \times U_n$ is the
pure-strategy space.

As  mentioned  in  the  previous  subsection,  each  player 
possesses  two  social  utilities.  To  describe  these,  we  define 
two  “selves”  or  perspectives  from  which  each  player  may 
consider  its  actions  [25].  The  selecting  self  considers  actions 
strictly  in  terms  of  their  associated  benefits,  while  the  rejecting 
self  considers  actions  only  in  terms  of  the  costs  incurred  in 
implementing them. These selves are described by the selectability
function $p_{S_i}(u_i)$ and rejectability function $p_{R_i}(u_i)$,
respectively.

Since  social  utilities  are  mass  functions,  they  are  normalized 
across  the  pure-strategy  sets  and  therefore  describe  the  relative 
benefits  and  costs  associated  with  a  pure  strategy  in  Ui.  They 
also  provide  players  with  a  formal  definition  of  “good  enough.” 
A  pure  strategy  is  “good  enough,”  or  satisficing,  if  the  relative 
benefits  are  at  least  as  great  as  the  relative  costs.  In  the 
vernacular,  we  may  view  satisficing  as  “getting  one’s  money 
worth,”  as  opposed  to  optimization,  where  players  seek  “the 
best  and  only  the  best.”  While  the  former  concept  allows  for 
a  set  of  multiple  actions  that  are  “good  enough,”  the  latter  is 
designed  to  produce  a  unique  solution.  We  therefore  define  the 
individually  satisficing  set  for  player  i  as 

$$\Sigma_i = \{u \in U_i : p_{S_i}(u) \ge q\, p_{R_i}(u)\}, \qquad (1)$$

where $q$ is the index of caution. Typically, $q = 1$, but
we may adjust a player’s definition of “good enough” by
changing $q$. Setting $q \le 1$ ensures that $\Sigma_i$ is not empty.
We may combine the players’ individually satisficing sets by
forming the satisficing rectangle $\mathfrak{R}_{12 \cdots n}$, which is
defined as the Cartesian product

$$\mathfrak{R}_{12 \cdots n} = \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_n. \qquad (2)$$

The satisficing rectangle is the set of all strategy profiles that
are simultaneously satisficing to each player.
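As a minimal sketch of definitions (1) and (2), the following Python computes individually satisficing sets and their Cartesian product for two players. The function name `satisficing_set` and all utility values are hypothetical, chosen only to illustrate the comparison of selectability against rejectability.

```python
from itertools import product


def satisficing_set(p_S, p_R, q=1.0):
    """Strategies whose selectability is at least q times their rejectability."""
    return {u for u in p_S if p_S[u] >= q * p_R[u]}


# Hypothetical marginal utilities over the strategy set {s, h}.
p_S1 = {"s": 0.6, "h": 0.4}
p_R1 = {"s": 0.5, "h": 0.5}
p_S2 = {"s": 0.3, "h": 0.7}
p_R2 = {"s": 0.3, "h": 0.7}

sigma_1 = satisficing_set(p_S1, p_R1)  # only "s" passes the test
sigma_2 = satisficing_set(p_S2, p_R2)  # ties count, so both strategies pass

# Satisficing rectangle: every profile that is satisficing to both players.
rectangle = set(product(sorted(sigma_1), sorted(sigma_2)))
```

Because both utilities are normalized, at least one strategy always passes the test when q is at most 1, which is why the set cannot be empty in that case. Unlike an optimizer, player 2 here ends up with two acceptable strategies rather than one.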

It is convenient to express the relationship between players’
utilities graphically. In probability theory, relationships
between random variables are expressed in Bayesian networks
[26]. Similarly, in satisficing theory the relationships between
players’ utilities are expressed in praxeic networks.2 The praxeic
network consists of a directed acyclic graph (DAG), where
the  nodes  are  the  selecting  and  rejecting  perspectives  of  each 
player  and  the  edges  are  the  conditional  utility  functions.  For 
example,  consider  the  simple  two-player  community  depicted 
in Figure 1. For each player, the rejecting preferences depend
on  the  selecting  preferences  of  the  other  player,  while  the 
selecting  preferences  are  independent. 


Parenthetically,  we  note  that  praxeic  networks  also  resemble 
the  spatial  evolutionary  models  discussed  in  [15, 16, 28-30].  In 
these  models,  graphical  connections  determine  which  players 
can  interact  during  play.  That  is,  individuals  may  only  play 
with players to whom they are connected. In contrast, graphical
connections in praxeic networks define how players influence
each other in play. Both models describe, in some sense,
players’  social  relationships.  But,  while  spatial  evolutionary 
models  describe  which  players  can  pair  up  in  a  game,  praxeic 
networks  describe  which  players’  utilities  can  influence  the 
utilities  of  others. 

In discussing the players’ social utilities, we retain the terminology
of probability theory. In the community from Figure
1, we refer to player 1’s conditional rejectability function,
denoted $p_{R_1|S_2}(v_1|u_2)$. As mentioned above, the conditional
mass function expresses a hypothetical proposition, where the
antecedent is the strategy favored by player 2, and the consequent
is the utility of player 1. That is, if player 2’s selecting
preferences entirely favored strategy $u_2$, what would be player
1’s rejectability for $v_1$? As with probability mass functions, we
may compute the marginal rejectability by summing over the
conditionals: $p_{R_1}(v_1) = \sum_{u_2 \in U_2} p_{R_1|S_2}(v_1|u_2)\, p_{S_2}(u_2)$. The
marginal utilities determine the individually satisficing sets
and the satisficing rectangle. If a utility is independent (such
as the selectability functions in this example), its marginal is
expressed directly, without conditioning.

By allowing conditioning in the players’ utilities, we implicitly
assume that players have at least partial knowledge
of each other’s utilities. Each player must have sufficient
knowledge of other players’ utilities in order to compute its
marginal utilities and find its individually satisficing set. In the
example community, each player must know the other player’s
selectability function in order to compute its own rejectability.
However, since players do not consider each other’s actions
in determining the individually satisficing sets, they need not
observe (or predict) each other’s choices.

2 The term praxeic is derived from praxeology, which refers to “the science
of human conduct” or “the science of efficient action” [27].

With the marginal and conditional utilities defined for
the example community, we can form the interdependence
function $p_{S_1 \cdots S_n R_1 \cdots R_n}(u_1, \ldots, u_n, v_1, \ldots, v_n)$, which is the
joint mass function of all players’ selecting and rejecting preferences.
By the chain rule of probability theory, the interdependence
function for this example is

$$p_{S_1 S_2 R_1 R_2}(u_1, u_2, v_1, v_2) = p_{R_1|S_2}(v_1|u_2)\, p_{R_2|S_1}(v_2|u_1)\, p_{S_1}(u_1)\, p_{S_2}(u_2).$$

Satisficing games are characterized by the triple
$(X, U, p_{S_1 \cdots S_n R_1 \cdots R_n})$, where $X$ is the set of players, $U$ is the
pure-strategy space, and $p_{S_1 \cdots S_n R_1 \cdots R_n}$ is the interdependence
function. From this information, all necessary marginal
utilities can be computed and the satisficing rectangle can be
determined.
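To make the chain-rule factorization concrete, the sketch below builds the interdependence function for the two-player example community and recovers player 1's marginal rejectability from it. All numerical values are hypothetical; only the factorization structure (each rejecting self conditioned on the other player's selecting self) follows the text.

```python
from itertools import product

U = ("s", "h")

# Hypothetical independent selectabilities.
p_S1 = {"s": 0.6, "h": 0.4}
p_S2 = {"s": 0.3, "h": 0.7}

# Hypothetical conditionals, indexed as [conditioning strategy][own strategy].
p_R1_given_S2 = {"s": {"s": 0.2, "h": 0.8}, "h": {"s": 0.7, "h": 0.3}}
p_R2_given_S1 = {"s": {"s": 0.1, "h": 0.9}, "h": {"s": 0.6, "h": 0.4}}


def interdependence(u1, u2, v1, v2):
    """Joint mass of the four selves via the chain rule."""
    return (p_R1_given_S2[u2][v1] * p_R2_given_S1[u1][v2]
            * p_S1[u1] * p_S2[u2])


joint = {k: interdependence(*k) for k in product(U, repeat=4)}
total = sum(joint.values())  # a valid joint mass function sums to one

# Marginal rejectability of player 1, obtained two ways.
p_R1_from_joint = {
    v: sum(p for (u1, u2, v1, v2), p in joint.items() if v1 == v) for v in U
}
p_R1_direct = {
    v: sum(p_R1_given_S2[u2][v] * p_S2[u2] for u2 in U) for v in U
}
```

Summing the joint over every argument except player 1's rejecting strategy gives the same result as the direct formula, here 0.2 * 0.3 + 0.7 * 0.7 = 0.55 for stag.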

Finally,  it  is  often  useful  to  specify  the  players’  social 
utilities  in  terms  of  variable  parameters,  which  we  refer  to 
as  the  players’  attitudes.  The  interpretation  of  the  attitudes, 
of  course,  depends  on  the  specific  game  being  played,  but  in 
general  they  express  each  player’s  temperament,  which  affects 
the  degree  to  which  its  utilities  depend  on  those  of  other 
players.  For  example,  in  the  Stag  Hunt,  the  players’  attitudes 
will  characterize  their  aversion  to  risk,  which  influences  each 
player’s  willingness  to  engage  in  stag-hunting. 

B.  Random  Satisficing  Games 

Often,  a  player’s  utility  will  depend  on  random  phenomena, 
resulting  in  expected  utilities  based  on  the  distribution  of  the 
random  event.  With  classical  game  theory,  it  is  required  that 
the  probabilistic  distributions  of  the  random  phenomena  not  be 
influenced  by  the  preferences  of  the  players.  In  other  words,  a 
player’s  belief  regarding  a  random  event  may  affect  its  utilities, 
but  not  vice  versa.  In  most  cases  this  restriction  poses  no 
difficulty.  However,  we  may  want  to  consider  circumstances  in 
which  a  player’s  subjective  probability  about  an  event  depends 
on  players’  preferences. 

The  conditional  structure  of  social  utilities  provides  for  such 
a  possibility.  Since  the  utilities  are  mass  functions,  we  can 
combine  both  probabilistic  and  preferential  information  into  a 
single  model.  Figure  2  illustrates  a  network  implementing  such 
a  model.  This  praxeic  network  is  similar  to  Figure  1  in  that 
it  contains  the  same  four  vertices  associated  with  the  players’ 
selecting  and  rejecting  selves.  However,  we  also  include  two 
random variables $\theta_1$ and $\theta_2$, which represent phenomena that
are  known  to  the  players  only  probabilistically.  This  network 
describes  both  players  whose  preferences  depend  on  random 
phenomena  and  random  phenomena  which  depend  on  players’ 
preferences. The dependencies from Figure 1 still persist. $R_1$
still depends, albeit indirectly through $\theta_2$, on $S_2$, and $R_2$
still depends on $S_1$, which now depends on $\theta_1$.

Fig. 2. A praxeic network with “true” random variables.

III. The Stag Hunt

In the Stag Hunt, players choose between two pure strategies:
hunt stag or hunt hare, denoted $s$ and $h$, respectively.
The payoff for playing each pure strategy depends on the
action of the other player. If the other player hunts stag,
the  payoff  for  hunting  stag  is  higher  than  that  of  hunting 
hare.  However,  if  the  other  player  hunts  hare,  stag  hunting 
yields  a  low  payoff.  That  is,  the  players  must  hunt  together 
to  catch  the  stag  and  obtain  the  higher  payoff.  The  payoff  for 
hunting  hare,  on  the  other  hand,  is  independent  of  the  other 
player’s  choice.  Each  player  can  individually  catch  a  hare,  and 
therefore  can  always  opt  for  the  modest — but  more  secure — 
payoff  associated  with  consuming  a  hare.  We  quantitatively 
express  the  players’  utilities  in  the  payoff  matrix  of  Table  I. 

TABLE I

Payoff matrix for a two-player Stag Hunt. Each cell lists
(Player 1 payoff, Player 2 payoff).

                  Player 2
                  s         h
Player 1   s   (4, 4)    (0, 3)
           h   (3, 0)    (3, 3)

There  are  two  pure-strategy  Nash  equilibria  for  the  Stag 
Hunt: (s, s) and (h, h). If the players simultaneously hunt
stag or hare, there is no incentive for either player to change
actions. There is also a mixed-strategy equilibrium, in which
each player invokes a randomized rule to choose between the
two pure strategies. We will study the mixed-strategy equilibrium
in more detail later. Each pure-strategy equilibrium
has  its  benefits.  The  (s,  s)  equilibrium  is  optimal  in  that  it 
maximizes  both  players’  payoffs.  However,  since  successful 
stag  hunting  requires  the  cooperation  of  the  other  player,  risk- 
averse  players  may  choose  instead  to  hunt  hare.  The  (h,  h) 
equilibrium  is  regarded  as  the  risk-dominant  equilibrium  in  the 
sense  that  the  potential  gains  of  deviating  from  hare  hunting 
are  less  than  the  potential  losses:  at  best,  a  hare  hunter  will 
increase  its  utility  by  one  by  switching  to  stag,  but  at  worst,  it 
will  decrease  its  utility  by  three.  Thus,  conservative — yet  fully 
rational — players  might  choose  to  hunt  hare. 
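The equilibrium claims above can be checked mechanically from Table I. The following sketch enumerates pure-strategy profiles, tests the best-response condition, and computes the indifference probability behind the mixed equilibrium; the helper names are illustrative and not part of the paper.

```python
from fractions import Fraction

ACTIONS = ("s", "h")

# payoff[(a1, a2)] = (payoff to player 1, payoff to player 2), from Table I.
payoff = {
    ("s", "s"): (4, 4), ("s", "h"): (0, 3),
    ("h", "s"): (3, 0), ("h", "h"): (3, 3),
}


def is_pure_nash(a1, a2):
    """True if neither player gains by deviating unilaterally."""
    best_1 = all(payoff[(a1, a2)][0] >= payoff[(d, a2)][0] for d in ACTIONS)
    best_2 = all(payoff[(a1, a2)][1] >= payoff[(a1, d)][1] for d in ACTIONS)
    return best_1 and best_2


pure_nash = [(a1, a2) for a1 in ACTIONS for a2 in ACTIONS
             if is_pure_nash(a1, a2)]

# Mixed equilibrium: the opponent hunts stag with probability p such that
# a player is indifferent between stag and hare, i.e. 4p + 0(1 - p) = 3.
numerator = payoff[("h", "h")][0] - payoff[("s", "h")][0]
denominator = (payoff[("s", "s")][0] - payoff[("s", "h")][0]
               - payoff[("h", "s")][0] + payoff[("h", "h")][0])
p_mixed = Fraction(numerator, denominator)

# Risk comparison for a hare hunter who considers switching to stag.
best_case_gain = payoff[("s", "s")][0] - payoff[("h", "s")][0]
worst_case_loss = payoff[("h", "h")][0] - payoff[("s", "h")][0]
```

The indifference probability of 3/4 reappears later as the boundary between the two basins of attraction of the replicator dynamics.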

This  dichotomy  illustrates  the  fundamental  issue  of  the  Stag 
Hunt.  Obviously,  if  each  player  had  certain  assurance  that 
the  other  player  would  hunt  stag,  everyone  would  cooperate.3 
However,  players  do  not  have  such  an  assurance  under  the 
usual  model,  but  must  choose  their  actions  independently.  The 
players’ actions then boil down to how much confidence each
player has in the other’s willingness to cooperate and how
risk-averse each player is. As mentioned by Skyrms, classical
game theory has little to say about this topic. Indeed, the Nash
equilibria do not tell us which actions the players will take.
They simply imply that once a pair of players is in either of the
pure-strategy equilibria, neither player will have incentive to
deviate. To study which equilibrium will result under different
circumstances, we turn to evolutionary game theory [32,33].

3 Interestingly, it is straightforward to show that if the game is played
sequentially (i.e., player 1 makes its move, and then player 2, who observes
player 1’s choice, moves), mutual stag-hunting becomes the unique subgame
perfect Nash equilibrium [31].

A.  The  Replicator  Dynamics 

The replicator dynamics is the classic instantiation of evolutionary
game theory. It models the evolution of a population’s
strategies  according  to  their  ecological  fitness.  Consider  a 
large  population  of  players  who  are  “programmed”  to  play  a 
particular  strategy — regardless  of  the  other  player’s  behavior — 
in  a  symmetric  two-player  game  such  as  the  Stag  Hunt.  The 
players  are  randomly  paired  up  to  play  the  game  at  each 
time  step.  Each  player  reproduces  asexually4  according  to 
its  payoffs;  that  is,  the  number  of  offspring  that  a  player 
has  is  proportional  to  its  payoff  during  the  previous  game. 
Players’  strategies  also  “breed  true,”  meaning  that  offspring 
are  programmed  to  the  same  pure  strategy  as  their  parents.  We 
assume  that  the  population  is  well-mixed,  giving  each  player 
an  equal  chance  of  being  paired  with  any  other  player. 

For a symmetric, two-player game where each player must
choose some strategy in the pure-strategy set $U$, define the
mixed-strategy simplex $\Delta_U$ as the set of all mixed (randomized)
strategies over $U$. If $U$ contains $m$ elements, we can
characterize a mixed strategy as a nonnegative $m$-dimensional
vector $x$ that obeys the constraint $\sum_{i=1}^{m} x_i = 1$. Each player’s
mixed strategy is probabilistically independent of the other
player’s. The interior of $\Delta_U$ is the set of mixed strategies
which assign nonzero probability to each pure strategy:

$$\mathrm{int}(\Delta_U) = \{x \in \Delta_U : x_i > 0,\ i \in \{1, \ldots, m\}\}.$$

In the replicator dynamics, we interpret each element $x_i$ as the
population share, or fraction of the population, playing pure
strategy $i$. That is, if we randomly draw an individual from
the population described by $x$, the probability that it will be
programmed to play $i$ is $x_i$. At time $t$, the expected utility5 of
a player who plays pure strategy $i$ against a random member
of the population is $u(i, x(t)) = \sum_{j=1}^{m} \pi(i, j)\, x_j(t)$, where
$\pi(i, j)$ represents the utility of playing pure strategy $i$ against
pure strategy $j$. As the players reproduce, the population shares
described by $x(t)$ vary, and the more successful strategies
tend to dominate over those which are poorly adapted to
the evolving community. As the population size approaches
infinity we may invoke the law of large numbers, and the
dynamics of the population shares becomes a system of $m$
differential equations:

$$\dot{x}_i(t) = [u(i, x(t)) - u(x(t), x(t))]\, x_i(t), \quad i \in \{1, \ldots, m\}, \qquad (3)$$

where $u(x(t), x(t))$ is the population’s average expected utility,

$$u(x(t), x(t)) = \sum_{i=1}^{m} u(i, x(t))\, x_i(t) = \sum_{i=1}^{m} \sum_{j=1}^{m} \pi(i, j)\, x_i(t)\, x_j(t).$$

Intuitively, (3) tells us that a pure strategy’s population share
increases at time $t$ if its expected utility is higher than the
average expected utility across the population. It is shown
in [32] that, if the initial conditions satisfy $x(0) \in \mathrm{int}(\Delta_U)$
(all pure strategies are represented in the initial conditions),
any steady state to which the dynamics converges is a Nash
equilibrium in the players’ strategies.

4 This does not contradict the fact that the players must pair off to play the
game. While they do play the game pairwise, each player earns its payoff
individually. The number of offspring it produces is proportional only to its
own payoff, and is entirely independent of the other player’s.

5 We use $\pi$ to represent the utility (or payoff) when players use only
pure strategies, while $u$ represents the expected utility when mixed strategies
are involved.
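The replicator equation (3) is straightforward to integrate numerically. The sketch below uses a forward-Euler scheme, which is an assumption of this illustration rather than the method used by the authors; the step size, horizon, and function names are likewise hypothetical. The payoff matrix is the Stag Hunt of Table I, with rows and columns ordered (stag, hare).

```python
def replicator_rhs(payoff, x):
    """Right-hand side of (3): (u(i, x) - u(x, x)) * x_i for each strategy i."""
    m = len(x)
    u = [sum(payoff[i][j] * x[j] for j in range(m)) for i in range(m)]
    average = sum(u[i] * x[i] for i in range(m))
    return [(u[i] - average) * x[i] for i in range(m)]


def simulate(payoff, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator dynamics from x0."""
    x = list(x0)
    for _ in range(steps):
        dx = replicator_rhs(payoff, x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# pi(i, j): row player's payoff, strategies ordered (stag, hare).
STAG_HUNT = [[4, 0], [3, 3]]

mostly_hare = simulate(STAG_HUNT, [0.70, 0.30])
mostly_stag = simulate(STAG_HUNT, [0.80, 0.20])
```

In this sketch the components of the right-hand side sum to zero, so the population shares continue to sum to one, and the two initial conditions end up at opposite pure-strategy states.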

It should be noted that the standard replicator model describes
a selection dynamics rather than a mutation dynamics.
Players  do  not  change  strategies  under  this  model;  instead, 
the  offspring  of  players  whose  strategies  are  suboptimal  are 
overwhelmed  by  the  offspring  of  more  successful  players. 
As  time  continues,  the  fraction  of  the  population  playing 
suboptimal  strategies  becomes  arbitrarily  small. 

To  account  for  random  factors  such  as  mutation,  migration, 
and  payoff  fluctuations,  several  stochastic  replicator  models 
have been proposed [13, 14, 34, 35]. We examine a model from
[14], which augments the standard replicator dynamics by
introducing fixed mutation probabilities into the dynamics. The
mutation probabilities are contained in the matrix $W = [W_{ij}]$,
where $W_{ij}$ represents the probability that an individual playing
strategy $j$ spontaneously switches to strategy $i$. The mutation
dynamics differs from (3) by the addition of a mutation term:

$$\dot{x}_i(t) = [u(i, x(t)) - u(x(t), x(t))]\, x_i(t) + \sum_{j=1}^{m} \left( W_{ij}\, x_j(t) - W_{ji}\, x_i(t) \right). \qquad (4)$$

The dynamics for $x_i$ are altered by adding the rate at which
players mutate into the population share $x_i$ (described by
$\sum_j W_{ij} x_j$) and subtracting the rate at which players mutate
out of the population share $x_i$ (described by $\sum_j W_{ji} x_i$). When
mutation probabilities are zero ($W = I$), (4) collapses to
the standard replicator dynamics. In general, however, we are
forced to give up the theoretical properties guaranteed under
the standard replicator model. The steady-state behavior of the
system no longer corresponds to Nash equilibria, regardless of
initial conditions.
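A minimal sketch of the right-hand side of (4) follows, using the convention stated above that entry (i, j) of W is the probability of switching from strategy j to strategy i. The function name, the example mutation probability of 0.1, and the example population are hypothetical.

```python
def mutation_rhs(payoff, W, x):
    """Right-hand side of (4): selection term plus net mutation inflow."""
    m = len(x)
    u = [sum(payoff[i][j] * x[j] for j in range(m)) for i in range(m)]
    average = sum(u[i] * x[i] for i in range(m))
    rhs = []
    for i in range(m):
        selection = (u[i] - average) * x[i]
        # Inflow from every strategy j into i, minus outflow from i to j.
        mutation = sum(W[i][j] * x[j] - W[j][i] * x[i] for j in range(m))
        rhs.append(selection + mutation)
    return rhs


STAG_HUNT = [[4, 0], [3, 3]]          # strategies ordered (stag, hare)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]   # no mutation
SYMMETRIC = [[0.9, 0.1], [0.1, 0.9]]  # hypothetical 10% mutation each way

x = [0.2, 0.8]
without_mutation = mutation_rhs(STAG_HUNT, IDENTITY, x)
with_mutation = mutation_rhs(STAG_HUNT, SYMMETRIC, x)
```

The diagonal entries of W cancel inside the sum, so only the off-diagonal switching probabilities affect the dynamics. With the identity matrix the mutation term vanishes and the function reduces to the standard replicator right-hand side.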

B.  Stag  Hunt  Replicator  Dynamics 

1) Standard Dynamics: For the Stag Hunt, the population
is described by the two-dimensional vector $x = (x_s, x_h)$.
The payoff matrix (Table I) shows that the payoff for a stag
hunter is four when paired with another stag hunter, and zero
when paired with a hare hunter. A stag hunter therefore gains
an expected utility of $u(s, x) = 4x_s$. Since the utility for
hunting hare is independent of the other player’s actions,
$u(h, x) = 3$. The population’s average expected payoff is
given by $u(x, x) = 4x_s^2 - 3x_s + 3$. Since $x_s = 1 - x_h$, we can
characterize the dynamics by examining only the stag hunting
share. Suppressing time arguments, we get

$$\dot{x}_s = [u(s, x) - u(x, x)]\, x_s = -4x_s^3 + 7x_s^2 - 3x_s. \qquad (5)$$
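As a quick check on (5), the cubic on its right-hand side factors completely, which makes its zeros and its sign easy to read off:

```latex
\begin{aligned}
-4x_s^3 + 7x_s^2 - 3x_s &= -x_s \left( 4x_s^2 - 7x_s + 3 \right) \\
                        &= -x_s \,(4x_s - 3)(x_s - 1).
\end{aligned}
```

The right-hand side therefore vanishes exactly at $x_s = 0$, $x_s = 3/4$, and $x_s = 1$. It is negative for $0 < x_s < 3/4$, so the stag-hunting share shrinks there, and positive for $3/4 < x_s < 1$, so the share grows.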

While  the  nonlinearities  prevent  a  closed-form  solution,  we 
can  easily  examine  the  qualitative  behavior  of  the  population. 
In Figure 3, we show a direction field for the replicator
dynamics, which gives the sign of the derivative as a function
of $x_s$. The stationary points, where $\dot{x}_s = 0$, occur at
$x_s \in \{0, 3/4, 1\}$. The point at $x_s = 3/4$ corresponds to the
mixed-strategy Nash equilibrium discussed previously. However, the
mixed-strategy equilibrium is not stable; any deviation drives
the dynamics to one of the pure-strategy points, which are
asymptotically stable. We may regard $x_s = 3/4$ as a boundary
for the initial conditions of the population: if fewer than 75%
of the population initially hunt stag, the dynamics quickly
drives stag hunters to relative extinction. If more than 75%
initially hunt stag, hare hunters die out. Although stag hunting
prevails in a predominantly cooperative society, these dynamics
cannot evolve cooperation from an initially non-cooperative
population.

Fig. 3. Direction field for Stag Hunt replicator dynamics.

2) Mutation Dynamics: Using the replicator model in (4),
we add a probability of mutation into the Stag Hunt dynamics
in the hope that mutation may help evolve a cooperative
population. We assume that the probability of mutating from
stag to hare is identical to the probability of mutating from
hare to stag. Consequently, we can parameterize the mutation
matrix by a single mutation probability $0 < \alpha < 1$:

$$W = \begin{bmatrix} 1 - \alpha & \alpha \\ \alpha & 1 - \alpha \end{bmatrix}.$$

The dynamics for $x_s$ becomes

$$\dot{x}_s = -4x_s^3 + 7x_s^2 - 3x_s + W_{sh}(1 - x_s) - W_{hs}\, x_s \qquad (6)$$
$$\quad\;\; = -4x_s^3 + 7x_s^2 - 3x_s + \alpha(1 - 2x_s). \qquad (7)$$

The  closed-form  expression  for  the  stationary  points  of  the 
dynamics is quite unwieldy, so in Figure 4 we plot the direction
field for the dynamics as a function of $\alpha$ and $x_s$. When
mutation  probabilities  are  small,  the  qualitative  behavior  of  the 
solution  does  not  change:  there  remain  two  stable  stationary 
points  at  which  nearly  all  of  the  population  hunts  either  stag 
or  hare  and  an  unstable  stationary  point  which  defines  the 
boundary  between  the  stag-hunting  and  hare-hunting  basins 
of  attraction.  The  boundary  point  increases  with  the  mutation 
rate,  suggesting  that  mutation  exacerbates  the  evolutionary 
difficulties  of  the  Stag  Hunt. 
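The behavior described above can be reproduced numerically from (7) alone. The sketch below scans the unit interval for sign changes of the right-hand side and refines each with bisection; the grid resolution, tolerance, and sample mutation probabilities are hypothetical choices for illustration.

```python
def xdot(x_s, alpha):
    """Right-hand side of (7)."""
    return -4 * x_s**3 + 7 * x_s**2 - 3 * x_s + alpha * (1 - 2 * x_s)


def bisect(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def stationary_points(alpha, n=1000):
    """Stationary points of (7) in [0, 1], found by a sign-change scan."""
    def f(x):
        return xdot(x, alpha)

    grid = [k / n for k in range(n + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        if f(a) == 0.0:
            roots.append(a)
        elif f(a) * f(b) < 0:
            roots.append(bisect(f, a, b))
    if f(1.0) == 0.0:
        roots.append(1.0)
    return roots


small = stationary_points(0.01)   # three points: stable, unstable, stable
larger = stationary_points(0.02)  # the unstable boundary moves upward
large = stationary_points(0.5)    # a single stationary point remains
```

For the two small mutation probabilities, the middle (unstable) stationary point lies above 3/4 and increases with the mutation probability, so a larger initial share of stag hunters is needed for cooperation to take over.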

For large mutation probabilities, the dynamics differs considerably,
leaving a single stationary point to which the dynamics
converges independent of initial conditions. Even with
absurdly high mutation rates, in which evolution is governed
more by mutation than by payoff, only a minority of the
population hunts stag. Of course, since the population size is
infinite, the mutation replicator model defines a deterministic
system, as in the standard dynamics. By contrast, finite populations,
with random pairings and mutation, may spontaneously
evolve cooperation from non-cooperation. But the moral of
the story is that, on average, even finite populations rarely
cooperate if they are large, well-mixed, and composed of
players that are pre-programmed to play a particular pure
strategy.

Fig. 4. Direction field for Stag Hunt stochastic replicator dynamics.

Finally, as we have already discussed, there exist evolutionary
models other than the replicator dynamics. In Section VI,
we investigate the effects of more sophisticated evolutionary
mechanisms on the Stag Hunt. For the time being, however,
we focus on the underlying structure of the players’ behavior.
Our solution, based upon satisficing game theory, affords a
flexible structure for players’ social interactions, increasing
the possibility for cooperation even under simple evolutionary
dynamics.


IV. The Satisficing Stag Hunt

In a two-player Stag Hunt, the set of players is $X = \{1, 2\}$,
and each player has an identical pure-strategy set
$U_i = \{s, h\}$, $i \in X$. In formulating a satisficing game, we
are free to select an arbitrary structure for the praxeic network
and specify the conditional utilities as we see fit. We are then
constrained to carry out the rules of probability in computing
the marginal utilities which determine the players’ behavior.
Thus, the formulation of a satisficing game is a process of
“designing” the conditional structure and examining the results
to see if the players’ behavior makes sense.

First, we give conceptual definitions for the selectability
and rejectability preferences, which we will further clarify as
we mathematically define the players’ social utilities. What
do we mean by “benefits” and “costs” for the players in
the Stag Hunt? In our treatment, we consider selectability
in terms of successful cooperation. To the extent to which
stag hunting can be successful, the selecting self prefers to
hunt stag. We associate rejectability with the raw opportunity
cost of an action, tempered by risk-aversion. The opportunity
cost of hunting hare is the payoff for catching a stag, and the
opportunity cost of hunting stag is the payoff for catching a
hare.

Next, we define the interconnections between the four selves
and form the praxeic network. Our model is illustrated in
Figure 5. In addition to the vertices corresponding to the
selecting and rejecting selves, we include a vertex which
corresponds to a binary random variable $\theta_S$ which accounts
for the possibility of failure. It is not necessarily certain, even
if both players hunt stag, that they will succeed. We use $\theta_S = 1$
to denote that a successful stag hunt is possible and $\theta_S = 0$ to
denote that stag hunting will result in failure.

Fig. 5. The praxeic network for the Stag Hunt.


To define the rejectability function for each agent, we first
must define a normalized measure of opportunity cost. Let $\phi_{s_i}$
and $\phi_{h_i}$ denote the raw utility (in arbitrary units) of consuming
stag and hare, respectively. Normalizing, the relative utility of
hare-hunting becomes $\mu_i' = \phi_{h_i} / (\phi_{h_i} + \phi_{s_i})$ for $i = 1, 2$. The relative
utility of stag-hunting is then $1 - \mu_i'$.

Given this definition, we may let $\phi_{s_i} = 4$ and $\phi_{h_i} = 3$,
the payoff values given in Table I, resulting in $\mu_i' = 3/7$.
However, we further wish to take into account the temperament
of the players. As discussed in Section III, a central issue in
the Stag Hunt is to determine what players of differing
risk-aversion levels should do. Therefore, we introduce a parameter,
$\rho_i$, which expresses the degree of player i’s risk aversion.
A player with $\rho_i = 1$ is risk-neutral, a player with $\rho_i > 1$
is risk-averse, and a player with $\rho_i < 1$ is payoff-seeking
and tends to ignore risk. We then define
$\mu_i = \rho_i \, \phi_{h_i} / (\phi_{h_i} + \phi_{s_i})$.
Thus, $\mu_i$ reflects a player’s willingness to take risks as well as
the relative utility for stag and hare. A maximally risk-averse
player will hunt stag only if success is certain, while a fully
payoff-seeking player will hunt stag regardless of the odds. To
ensure a meaningful game, we still require that neither player
ever prefers hare to stag, or $\mu_i < 1/2$ for $i = 1, 2$. For
convenience, we will simply refer to $\mu_i$ as player i’s
risk-aversion level, which parameterizes the player’s attitudes.

We define each player’s rejectability function as

$$p_{R_i}(u_i) = \begin{cases} \mu_i, & \text{for } u_i = s \\ 1 - \mu_i, & \text{for } u_i = h \end{cases} \tag{8}$$

an expression of normalized opportunity cost for each action.
The cost of hunting stag is the relative hare hunting utility,
and vice versa. Note that the players’ rejecting selves are not
dependent on others’ preferences, allowing us to define the
marginal utilities directly.
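
The normalization and the rejectability function of Eq. (8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors’ implementation: the function names `relative_hare_utility` and `rejectability` are hypothetical, and the sample level 0.3 is an arbitrary value inside the admissible range [0, 1/2).

```python
def relative_hare_utility(phi_stag: float, phi_hare: float) -> float:
    """Normalized utility of hare hunting, phi_h / (phi_h + phi_s)."""
    return phi_hare / (phi_hare + phi_stag)


def rejectability(mu: float) -> dict:
    """Rejectability mass function of Eq. (8) for a risk-aversion level mu."""
    if not 0.0 <= mu < 0.5:
        # The text requires mu < 1/2 so that hare is never preferred to stag.
        raise ValueError("mu must lie in [0, 1/2)")
    return {"s": mu, "h": 1.0 - mu}


# Raw utilities quoted from Table I in the text: stag = 4, hare = 3.
mu_prime = relative_hare_utility(phi_stag=4.0, phi_hare=3.0)  # equals 3/7

# Hypothetical risk-aversion level, chosen only for illustration.
p_R = rejectability(0.3)
print(mu_prime, p_R)
```

Returning a dictionary keyed by the pure strategies keeps the mass function easy to combine with the conditional distributions defined next.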


We next define the conditional distribution for $\theta_s$. The
distribution of this random variable, which is conditioned upon
both players’ rejecting selves, represents the probability that
the players will successfully hunt stag. The distribution of $\theta_s$
incorporates whether or not $R_1$ and $R_2$ reject cooperation and
how likely the players are to catch a stag if they cooperate.
We model the latter consideration by defining $0 \le \sigma \le 1$,
which represents the probability of catching a stag given that
the players cooperate. It may reflect the number of stag in
the environment, the players’ hunting skills, or other external
factors. If $R_1$ and $R_2$ both reject hare hunting, then the
players will cooperate and successfully capture a stag with
probability $\sigma$. We characterize this by defining

$$p_{\theta_s|R_1R_2}(\vartheta_s|h,h) = \begin{cases} \sigma, & \text{for } \vartheta_s = 1 \\ 1 - \sigma, & \text{for } \vartheta_s = 0 \end{cases} \tag{9}$$


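where $\theta_s$ represents the random variable and $\vartheta_s$ represents its
realization. If, however, either player unilaterally rejects stag
hunting, the probability of catching a stag is zero, yielding

$$\begin{aligned}
p_{\theta_s|R_1R_2}(\vartheta_s|s,s) &= p_{\theta_s|R_1R_2}(\vartheta_s|s,h) && (10) \\
&= p_{\theta_s|R_1R_2}(\vartheta_s|h,s) && (11) \\
&= \begin{cases} 0, & \text{for } \vartheta_s = 1 \\ 1, & \text{for } \vartheta_s = 0 \end{cases} && (12)
\end{aligned}$$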


Notice  that  the  players’  preferences  influence  the  probability 
of  a  random  event  as  discussed  in  Section  II-B.  Since  the 
players’  rejecting  preferences  affect  their  willingness  to  hunt 
stag,  the  conditional  structure  is  justifiable. 


We compute the marginal mass function by summing over
the conditioning random variables, yielding

$$\begin{aligned}
p_{\theta_s}(\vartheta_s) &= \sum_{v_1, v_2} p_{\theta_s|R_1R_2}(\vartheta_s|v_1, v_2)\, p_{R_1}(v_1)\, p_{R_2}(v_2) && (13) \\
&= \begin{cases} \sigma(1-\mu_1)(1-\mu_2), & \text{for } \vartheta_s = 1 \\ 1 - \sigma(1-\mu_1)(1-\mu_2), & \text{for } \vartheta_s = 0 \end{cases} && (14)
\end{aligned}$$

From (14) we see that as the risk-aversion levels decrease,
the probability of a successful stag hunt increases. If both
players are completely payoff-seeking ($\mu_1 = \mu_2 = 0$), the
probability of a successful stag hunt is $\sigma$. Either player can
reduce the chances for a successful hunt. As the risk-aversion
$\mu_i$ increases for either player, the probability of a successful
stag hunt decreases.
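
The marginalization can be checked numerically by summing the conditional success probability over the four rejection profiles. The sketch below is an illustration only: the function names are hypothetical, and the numeric arguments in the usage lines are arbitrary sample values rather than values from the paper.

```python
from itertools import product


def rejectability(mu):
    """Rejectability mass function: mass mu on stag, 1 - mu on hare."""
    return {"s": mu, "h": 1.0 - mu}


def success_conditional(v1, v2, sigma):
    """P(success | R1 rejects v1, R2 rejects v2).

    Success is possible (with probability sigma) only when both rejecting
    selves reject hare hunting; otherwise it is zero.
    """
    return sigma if (v1, v2) == ("h", "h") else 0.0


def success_probability(mu1, mu2, sigma):
    """Marginal probability of a successful stag hunt, by explicit summation."""
    p_r1, p_r2 = rejectability(mu1), rejectability(mu2)
    return sum(
        success_conditional(v1, v2, sigma) * p_r1[v1] * p_r2[v2]
        for v1, v2 in product("sh", repeat=2)
    )


# The explicit sum agrees with the closed form sigma * (1 - mu1) * (1 - mu2).
print(success_probability(0.2, 0.3, 0.9), 0.9 * 0.8 * 0.7)
```

Only the (h, h) term survives the sum, which is why the closed form is a simple product.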


Finally, we define the conditional selectability. Each player’s
selectability is influenced by the probability of a successful
stag hunt. The selectability, as discussed earlier, is tied to the
benefits of cooperation: to the extent that a successful stag
hunt is possible ($\theta_s = 1$), selectability favors stag hunting. The
higher the probability of successful stag hunting, the more
beneficial it is to hunt stag. The corresponding conditional
selectability function is

$$p_{S_i|\theta_s}(u_i|\vartheta_s) = \begin{cases} 1, & \text{for } u_i = s \mid \vartheta_s = 1 \\ 0, & \text{for } u_i = h \mid \vartheta_s = 1 \\ 0, & \text{for } u_i = s \mid \vartheta_s = 0 \\ 1, & \text{for } u_i = h \mid \vartheta_s = 0 \end{cases} \tag{15}$$

The simple form of the conditionals allows us to express the
marginal selectability as

$$p_{S_i}(u_i) = \begin{cases} \sigma(1-\mu_1)(1-\mu_2), & \text{for } u_i = s \\ 1 - \sigma(1-\mu_1)(1-\mu_2), & \text{for } u_i = h \end{cases} \tag{16}$$


A. The Satisficing Rectangle

With all of the social utilities defined, we have completely
characterized the players’ utilities and can solve for the
pure-strategy profiles that form the satisficing rectangle. As
discussed in Section II, the satisficing rectangle is the set
of pure-strategy profiles that are simultaneously satisficing
to each player individually. In Figure 6, we set q = 1 and
plot the regions of the satisficing rectangle as functions of
$\mu_1$ and $\mu_2$, which specify the players’ attitudes. There are
four possibilities. When both players have low risk-aversion,
(s, s) is the unique strategy profile in the satisficing rectangle.
If risk-aversion is high in both players, (h, h) results. In the
(h, s) and (s, h) regions, however, one player is strongly
risk-averse while the other strongly seeks payoff, resulting in one
player that tries to cooperate while the other does not. On
the boundaries of the four regions, the satisficing rectangle
contains multiple strategy profiles.


Fig.  6.  Satisficing  rectangle  regions  for  the  Stag  Hunt. 

These last two regions illustrate a unique feature of satisficing
models. In the (h, s) and (s, h) regions, one player chooses
to hunt hare while the other player, who is aware of the first
player’s increased risk-aversion, nevertheless stands by its post
and attempts to hunt stag. Such dysfunctional behavior is a
consequence of the structure of the utilities: players’ utilities
depend on the others’ attitudes rather than the strategies they
play.
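
The four regions can be reproduced with a short sketch. It assumes the usual satisficing test, in which a strategy is satisficing for a player when its selectability is at least q times its rejectability; that test belongs to Section II, which is not reproduced here, so treat it as an assumption. The function names and the sample attitude values are hypothetical, and the success probability sigma is left as a free parameter.

```python
from itertools import product


def satisficing_set(mu_i, p_success, q=1.0):
    """Strategies whose selectability is at least q times their rejectability."""
    select = {"s": p_success, "h": 1.0 - p_success}
    reject = {"s": mu_i, "h": 1.0 - mu_i}
    return {u for u in "sh" if select[u] >= q * reject[u]}


def satisficing_rectangle(mu1, mu2, sigma=1.0, q=1.0):
    """Profiles that are satisficing to both players simultaneously."""
    p_success = sigma * (1.0 - mu1) * (1.0 - mu2)
    s1 = satisficing_set(mu1, p_success, q)
    s2 = satisficing_set(mu2, p_success, q)
    return set(product(sorted(s1), sorted(s2)))


# Sample attitude pairs (arbitrary values, chosen to land in different regions).
for mu1, mu2 in [(0.1, 0.1), (0.45, 0.45), (0.05, 0.49)]:
    print((mu1, mu2), satisficing_rectangle(mu1, mu2))
```

With q = 1 the test reduces to comparing the success probability with each player’s own risk-aversion level, so a player hunts stag exactly when success is more likely than that player’s level.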

We  hasten  to  note  that  dysfunctional  behavior  is  not  a  failure 
per  se  of  the  satisficing  model.  Dysfunctional  societies  do 
exist  in  practice,  and  we  may  interpret  these  regions  as  an 
acknowledgement  that  players  with  incompatible  attitudes  may 
act  incoherently.  However,  in  designing  artificial  systems,  we 
typically  prefer  to  avoid  incoherent  behaviors,  sociologically 
justifiable  or  not.  It  seems  unreasonable  that  incompatible 
players  would  continue  to  exhibit  the  same  attitudes  and  to 
enact  the  same  incoherent  strategies.  Thus,  we  introduce  the 
attitude  dynamics,  which  provides  a  way  for  players  to  adapt 
their  attitudes  and  avoid  such  dysfunctional  behavior. 

V.  Attitude  Dynamics 

To introduce the attitude equilibrium and the attitude dynamics,
we first embellish the structure of the satisficing game.
We endow each player with a classical utility function which is
based solely on the strategy profile that the players implement.

Definition 1: An augmented satisficing game is a 5-tuple
$(X, U, p_{S_1 \cdots S_n R_1 \cdots R_n}, A, \pi(u))$. The first three elements are
the set of players, the pure-strategy space, and the interdependence
function as normal. Additionally, we introduce the
pure-attitude space $A = A_1 \times A_2 \times \cdots \times A_n$ containing the attitudes
that the players may exhibit. These attitudes are parameters
in the players’ social utilities, and are different for each
satisficing game. We also introduce $\pi(u)$, a vector payoff
function which describes the raw payoff to the players for
implementing the pure-strategy profile $u \in U$.

To  augment  a  satisficing  game,  the  players’  attitudes  must  be 
specified  as  distinct  parameters  in  the  players’  social  utilities. 
Further,  we  must  be  able  to  construct  a  raw  payoff  function  that 
is  separate  from  the  social  utilities.  Constructing  raw  payoff 
functions  may  be  difficult  in  practice.  In  a  system  of  artificial 
agents,  for  example,  the  agents’  objectives  may  be  sufficiently 
complicated  that  it  is  impossible  to  define  a  simple  payoff 
function  for  each  agent.  In  a  simple  game  like  the  Stag  Hunt, 
the extension is straightforward. Each player’s attitudes are
given by the risk-aversion level $\mu_i$, yielding a pure-attitude
space of $A = [0, 1/2) \times [0, 1/2)$. The payoff function $\pi(u)$
is described by the payoff matrix in Table I.

The augmented satisficing game describes a two-step mapping
from attitudes to payoffs. The social utilities, determined
by the interdependence function, map the players’ attitudes
to pure-strategy profiles.6 The payoff function then maps the
pure-strategy profile to raw payoffs. Thus, in an augmented
satisficing game, we may evaluate the raw utility of exhibiting
a particular attitude. To simplify notation, we will occasionally
refer to $\pi(a)$, the payoff to the players for implementing the
pure-strategy profile determined by the pure-attitude profile
$a \in A$. That is, we may think of an augmented satisficing
game as a non-cooperative game where players’ payoffs are
determined by the attitudes they exhibit rather than the
strategies they play.

6 We have glossed over the fact that, in general, the satisficing rectangle
contains multiple pure-strategy profiles. For the Stag Hunt, this presents no
problem because the satisficing rectangle contains a single strategy profile
almost everywhere. We will assume that, if necessary, the players employ a
tie-breaking mechanism to select a unique strategy profile.

We may also discuss mixed attitudes, which are probability
distributions over the attitudes the players exhibit. Denoting
the cardinality of $A_i$ as $k_i$, the mixed attitude of player i
is given by a (normalized and nonnegative) $k_i$-dimensional
vector $z_i$. The discussion of mixed strategies in Section III-A
applies directly to mixed attitudes. We assume that players’
mixed attitudes are probabilistically independent of each other.
We denote player i’s mixed-attitude simplex by $\Delta_i^a$. The
mixed-attitude space is the Cartesian product
$\Theta^a = \Delta_1^a \times \Delta_2^a \times \cdots \times \Delta_n^a$.
A mixed-attitude profile is a vector of mixed attitudes
$z = (z_1, z_2, \ldots, z_n) \in \Theta^a$.

Since the players’ mixed attitudes are independent, the
probability that a pure-attitude profile is exhibited is equal
to the product of the associated probabilities. Thus, player
i’s expected utility $u_i(z)$ when the players exhibit the
mixed-attitude profile $z \in \Theta^a$ is:

$$u_i(z) = \sum_{a \in A} \pi_i(a) \prod_{j=1}^{n} z_{j a_j}, \tag{17}$$

where $z_{j a_j}$ is the probability with which player j exhibits
the pure attitude $a_j$. Now, given complete knowledge of the
satisficing game and the other players’ utilities, a player may
consider changing its attitudes to increase expected utility,
which motivates the attitude equilibrium.

Definition 2: An attitude equilibrium is a mixed-attitude
profile $z^* \in \Theta^a$ such that

$$u_i(z_1^*, \ldots, z_i^*, \ldots, z_n^*) \ge u_i(z_1^*, \ldots, z_i', \ldots, z_n^*) \tag{18}$$

for each $z_i' \in \Delta_i^a$ and for each $i \in X$.

The definition for the attitude equilibrium is essentially identical
to that of the Nash equilibrium: no player can improve
its  expected  utility  by  exhibiting  a  different  mixed  attitude.  In 
fact,  we  may  say  that  an  attitude  equilibrium  is  an  equilibrium 
in  players’  attitudes,  rather  than  in  their  strategies.  Because  of 
the  analogy  between  the  attitude  equilibrium  and  the  Nash 
equilibrium,  many  theoretical  results  apply. 

Theorem 1: An attitude equilibrium exists for every augmented
satisficing game with finite attitude spaces.

Proof: This result relies upon the fact that any augmented
satisficing game defines a classical non-cooperative game
where X is the set of players, A takes the role of the
pure-strategy space and $\pi(a)$ is the payoff function. In [3], it
is shown that any non-cooperative game with a finite
pure-strategy space has at least one Nash equilibrium, although it
may exist only in mixed strategies. Since an attitude equilibrium
is simply a Nash equilibrium in the players’ attitudes,
one must exist for any augmented satisficing game with a finite
pure-attitude space, even if it exists only in mixed attitudes.

■ 

Note that a finite attitude space is a sufficient, but not necessary,
condition for the existence of an attitude equilibrium.
Indeed, for the Stag Hunt, even though the attitude spaces
are continuous, it is immediate that attitude equilibria exist in
pure attitudes. In Figure 7, the attitude equilibria are shown
for several values of $\sigma$. If the players’ pure-attitude profile
lies in these regions, there is no incentive for either player to
change attitudes.




Fig.  7.  Attitude  equilibrium  regions  for  the  Stag  Hunt. 

Consider the (s, s) region of the satisficing rectangle. Here,
both players receive maximum payoff and there is no incentive
for either player to deviate. Notice, however, that only part
of the (h, h) region is an equilibrium. This is because, when
player i’s risk-aversion $\mu_i$ is sufficiently low, it is possible
for player j to move the group from mutual hare-hunting to
stag-hunting by lowering its own $\mu_j$. Even though (h, h) is
an equilibrium under the classical game, the satisficing model
gives the players greater influence over each other’s behavior,
increasing the possibility for cooperation. As $\sigma$ increases, the
size of the (h, h) equilibrium decreases, disappearing entirely
when $\sigma = 1$.

Finally, notice that the dysfunctional regions (s, h) and
(h, s) do not contain equilibria. In these regions, each player
can improve its payoff by changing $\mu_i$ and forcing the game
into either (s, s) or (h, h). The attitude equilibrium concept
provides a useful juxtaposition of satisficing theory and individual
rationality: the social structure of the satisficing model
decreases the attraction of mutual hare-hunting, while the
introduction of the classical payoff function gives incentive
for players to adapt their attitudes and avoid the dysfunctional
behaviors of the (s, h) and (h, s) regions.
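
Whether a pure-attitude profile is an attitude equilibrium can be tested by brute force on a grid of risk-aversion levels. The sketch below is an illustration under stated assumptions: it uses the Stag Hunt payoffs quoted in this paper (3 for hare, 0 for a lone stag hunter, and 4 scaled by the success probability for a joint stag hunt), assumes q = 1, breaks ties toward stag, and restricts deviations to 100 grid points. All function names are hypothetical.

```python
def strategy(mu_i: float, p_success: float) -> str:
    """Stag if its selectability is at least its rejectability (q = 1).

    Ties are broken toward stag; this tie-break is an assumption.
    """
    return "s" if p_success >= mu_i else "h"


def payoff_to_first(mu1: float, mu2: float, sigma: float) -> float:
    """Raw payoff to player 1 for the pure-attitude profile (mu1, mu2)."""
    p_success = sigma * (1.0 - mu1) * (1.0 - mu2)
    u1, u2 = strategy(mu1, p_success), strategy(mu2, p_success)
    if u1 == "h":
        return 3.0                      # hare pays 3 regardless of the partner
    return 4.0 * sigma if u2 == "s" else 0.0


def is_attitude_equilibrium(mu1, mu2, sigma, grid) -> bool:
    """True if neither player gains by a unilateral change of attitude."""
    best1 = max(payoff_to_first(m, mu2, sigma) for m in grid)
    best2 = max(payoff_to_first(m, mu1, sigma) for m in grid)  # symmetric game
    return (payoff_to_first(mu1, mu2, sigma) >= best1
            and payoff_to_first(mu2, mu1, sigma) >= best2)


grid = [k / 200 for k in range(100)]    # 100 evenly spaced levels in [0, 1/2)
for profile in [(0.1, 0.1), (0.45, 0.45), (0.05, 0.49)]:
    print(profile, is_attitude_equilibrium(*profile, sigma=1.0, grid=grid))
```

Because the game is symmetric, player 2’s payoff is obtained by swapping the arguments of `payoff_to_first`, which keeps the check to a single payoff function.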

If a large population of players adapts by trial-and-error
experimentation, we can model the evolution of the players’
attitudes by a straightforward application of the standard
replicator dynamics. We again restrict our attention to symmetric,
two-player games. Thus, both players are described by the
pure-attitude set A and the payoff function $\pi(a)$. We require
that A be finite, and we denote the cardinality of A as m.
Define a normalized vector $z(t) = (z_1(t), z_2(t), \ldots, z_m(t))$,
where $z_i(t)$ represents the population share exhibiting the $i$th
pure attitude. Just as with traditional games, we may describe
the dynamics of the population shares by a system of m
differential equations:

$$\dot{z}_i(t) = \left[\pi(i, z(t)) - \pi(z(t), z(t))\right] z_i(t). \tag{19}$$

By analogy with the standard formulation, $\pi(i, z(t))$ is the
expected payoff for exhibiting the $i$th attitude against a
random sample from the population and
$\pi(z(t), z(t)) = \sum_i z_i(t)\, \pi(i, z(t))$ is the average expected payoff.
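
The replicator equations for the population shares can be integrated numerically with a forward-Euler step. The following sketch is illustrative only: the step size `dt`, the renormalization guard, and the two-attitude payoff table are assumptions made for the example, and `replicator_step` is a hypothetical name.

```python
def replicator_step(z, payoff, dt=0.01):
    """One forward-Euler step of the replicator dynamics.

    z      : population shares over the m pure attitudes (sums to 1)
    payoff : m x m table, payoff[i][j] = payoff for attitude i against j
    """
    m = len(z)
    # Expected payoff of attitude i against a random member of the population.
    fitness = [sum(payoff[i][j] * z[j] for j in range(m)) for i in range(m)]
    # Population-average expected payoff.
    average = sum(z[i] * fitness[i] for i in range(m))
    stepped = [z[i] + dt * (fitness[i] - average) * z[i] for i in range(m)]
    total = sum(stepped)                 # guard against round-off drift
    return [x / total for x in stepped]


# Hypothetical two-attitude example: attitude 0 leads to stag, attitude 1 to hare.
payoff = [[4.0, 0.0],
          [3.0, 3.0]]
z = [0.8, 0.2]
for _ in range(1000):
    z = replicator_step(z, payoff)
print(z)
```

Shares with above-average expected payoff grow and the others shrink, so the direction of each step depends only on the sign of the bracketed term in the differential equation.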

Let $\Delta_A$ be the mixed-attitude simplex of A. Just as with
mixed strategies, the interior of $\Delta_A$ is the set of all mixed
attitudes which give nonzero probability to each pure attitude.

Theorem 2: Let $\xi(t, z(0))$ denote the solution for the attitude
dynamics in (19) at time t, with initial conditions z(0).
If $z(0) \in \mathrm{int}(\Delta_A)$ and $\lim_{t \to \infty} \xi(t, z(0)) = z^*$, then $z^*$ is an
attitude equilibrium.

Proof: This result follows directly from the fact that
an augmented satisficing game can be thought of as a classical
game where players choose attitudes rather than play
strategies. As mentioned in Section III-A, it is shown in [32]
that,  when  initialized  with  a  mixed  strategy  on  the  interior  of 
the  mixed-strategy  simplex,  any  steady  state  of  the  replicator 
dynamics  is  a  Nash  equilibrium.  Since  an  attitude  equilibrium 
is  a  Nash  equilibrium  in  players’  attitudes,  the  result  holds  for 
the  attitude  dynamics.  ■ 

Note  that  Theorem  2  does  not  guarantee  that  a  steady-state  will 
occur,  even  under  well-behaved  initial  conditions.  Rather,  if  a 
steady-state  results  under  suitable  initial  conditions,  it  must  be 
an  attitude  equilibrium. 

VI.  Results 

A.  Attitude  Dynamics 

To apply the attitude dynamics, we first quantize the values
that $\mu$ may assume. Define $A = \{\nu_1, \nu_2, \ldots, \nu_{100}\}$, a set of
100 evenly-spaced values of $\mu$ over [0, 1/2). We initialize the
population shares z according to an exponential distribution
so that most players hunt hare, or $z_i(0) \propto e^{-\lambda(1/2 - \nu_i)}$. As we
set $\lambda$ higher, the initial population is more risk-averse and less
willing to hunt stag.

We use the payoff matrix in Table I to determine the
raw payoff for exhibiting a particular pure-attitude profile
$a = (\mu_1, \mu_2) \in A \times A$. If a is in the (h, h) region of
the satisficing rectangle (see Figure 6), then the payoff to
the first player is $\pi(\mu_1, \mu_2) = 3$. Similarly, the payoffs are
$\pi(\mu_1, \mu_2) = 3$ and $\pi(\mu_1, \mu_2) = 0$ if a belongs to the (h, s)
and (s, h) regions, respectively. Finally, $\pi(\mu_1, \mu_2) = 4\sigma$ if a
is in the (s, s) region.7

Because of the high dimensionality of the state space and the
complexity of the utility functions of the players’ preferences,
it is difficult to examine the attitude dynamics analytically. We
cannot easily solve for stationary points or say much about the
relative sizes of the basins of attraction as we could under the
(much simpler) standard replicator dynamics. Fortunately, we
can specify meaningful initial conditions and numerically approximate
the solution to the system of differential equations.
We examine several scenarios where the vast majority of the
population hunts hare and discuss when it is possible to evolve
a cooperative community.

7 We multiply by $\sigma$ in the payoff to account for the probability that the
players succeed given that they both hunt stag.

First, we examine the dynamics with $\sigma = 1$. We initialize
the population with $\lambda = 10$, leaving over 85% of the
population hunting hare. Figure 8(a) shows the initial joint
probability mass function of the players plotted along with
the four regions of the satisficing rectangle. The vertical axis
shows the joint probability that a pair of players, randomly
selected from the population, will end up at a particular point.
Since the players are drawn randomly and independently from
the infinite population, the joint probability is the product
of the marginal probabilities given by z. That is,
$\Pr(\mu_1 = \nu_i, \mu_2 = \nu_j) = z_i(t) z_j(t)$.

Initially, almost all of the joint probability mass is in the
mutual hare-hunting region. The dynamics, however, quickly
pushes the population towards stag-hunting. Within thirty
iterations, almost the entire population is in the mutual
stag-hunting region, with the most common values of $(\mu_i, \mu_j)$ close
to zero (Figure 8(b)). This is due to the fact that mutual
cooperation is the only attitude equilibrium when $\sigma = 1$. For
any positive, finite $\lambda$, all steady-state population distributions
will be entirely within the (s, s) region.
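
An experiment of this kind can be approximated with a short simulation. The sketch below is not the authors’ code and is not claimed to reproduce their figures: it assumes forward-Euler integration with a step `dt = 0.1`, initial shares proportional to exp(lambda * nu) (one reading of the exponential initialization), q = 1 with ties broken toward stag, and payoffs of 3, 0, and 4 times sigma as described for the satisficing regions. All function names are hypothetical.

```python
import math


def strategy(mu_i, p_success):
    """Stag when its selectability is at least its rejectability (ties: stag)."""
    return "s" if p_success >= mu_i else "h"


def payoff_to_first(mu1, mu2, sigma):
    """Raw payoff to the first player for the attitude pair (mu1, mu2)."""
    p_success = sigma * (1.0 - mu1) * (1.0 - mu2)
    u1, u2 = strategy(mu1, p_success), strategy(mu2, p_success)
    if u1 == "h":
        return 3.0
    return 4.0 * sigma if u2 == "s" else 0.0


def simulate(sigma, lam, steps, dt=0.1, m=100):
    """Forward-Euler integration of the attitude dynamics on an m-point grid."""
    grid = [k / (2 * m) for k in range(m)]          # evenly spaced in [0, 1/2)
    payoff = [[payoff_to_first(a, b, sigma) for b in grid] for a in grid]
    weights = [math.exp(lam * nu) for nu in grid]   # assumed initial tilt
    total = sum(weights)
    z = [w / total for w in weights]
    for _ in range(steps):
        fitness = [sum(p * zj for p, zj in zip(row, z)) for row in payoff]
        average = sum(zi * fi for zi, fi in zip(z, fitness))
        z = [zi + dt * (fi - average) * zi for zi, fi in zip(z, fitness)]
        norm = sum(z)
        z = [zi / norm for zi in z]
    return grid, z


def stag_pair_mass(grid, z, sigma):
    """Probability that a random pair of players lands in the (s, s) region."""
    mass = 0.0
    for a, za in zip(grid, z):
        for b, zb in zip(grid, z):
            p = sigma * (1.0 - a) * (1.0 - b)
            if strategy(a, p) == "s" and strategy(b, p) == "s":
                mass += za * zb
    return mass


grid, z_start = simulate(sigma=1.0, lam=10.0, steps=0)
grid, z_end = simulate(sigma=1.0, lam=10.0, steps=100)
print(stag_pair_mass(grid, z_start, 1.0), stag_pair_mass(grid, z_end, 1.0))
```

The joint mass in the (s, s) region is computed from the product of the marginal shares, matching the independence assumption for randomly paired players.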

Next, we lower $\sigma$ to see how the dynamics changes.
Keeping the initial conditions the same, we let $\sigma = 0.925$,
introducing the (h, h) attitude equilibrium region. Now, over
90%  of  the  initial  population  hunts  hare.  This  scenario  yields  a 
highly  interesting  result.  The  hare  hunting  equilibrium  initially 
dominates  and  the  population  shares  associated  with  the  stag 
hunting  regions  quickly  diminish  (Figure  9(a)).  We  notice, 
however,  that  there  are  small  migrations  toward  the  boundaries 
of  the  decision  regions.  These  players  still  predominantly  hunt 
hare,  but  they  are  less  risk-averse.  As  evolution  continues,  a 
small  concentration  of  players  emerges  around  the  boundaries 
of  the  four  regions,  as  illustrated  in  Figure  9(b).  Players  in 
this  region  are  quite  versatile:  they  hunt  hare  with  risk-averse 
players,  hunt  stag  with  the  payoff-seekers,  and  only  very  rarely 
will  they  end  up  hunting  stag  with  a  player  who  refuses 
to  cooperate.  The  concentration  of  players  slowly  begins  to 
dominate,  causing  more  and  more  players  to  hunt  stag.  Figure 
9(c)  shows  the  population  at  t  =  100.  By  this  time,  essentially 
all  of  the  population  is  composed  of  moderately  risk-averse 
but  versatile  players.  This  truly  emergent  result  provides  an 
interesting  insight  in  defining  “fitness”  in  a  social  system.  In  an 
uncertain  scenario  where  both  hare-hunting  and  stag-hunting 
are  potentially  dominant  strategies,  the  most  successful  players 
are  those  who  are  flexible — those  who  can  adapt  their  actions 
to  the  preferences  of  those  around  them. 

If we lower $\sigma$ much below 0.925, the dynamics fails
to evolve the society toward cooperation for these initial
conditions. This happens for two reasons: (1) the size of the
(s, s) region becomes smaller with decreasing $\sigma$, and (2) the
expected payoff for exhibiting attitudes in the (s, s) region
decreases. However, even under the unfavorable conditions
shown where a pair of stag hunters might fail, the satisficing
model can evolve cooperation from noncooperation. Fewer
than 10% of the initial population are required to hunt stag
in the satisficing model, a significant improvement over the
standard replicator model, where over 75% must initially hunt
stag.

Fig. 8. Joint attitude distribution for $\sigma = 1$, $\lambda = 10$.

B. Spatial Evolutionary Models

For  comparison,  we  also  consider  the  Stag  Hunt  under  the 
spatial  evolutionary  models  discussed  in  [15, 16, 28-30],  which 
have  been  proven  effective  in  promoting  cooperation  in  social 
dilemmas.  In  [15],  the  Stag  Hunt  is  specifically  studied  in 
terms  of  the  relative  benefit  for  mutual  stag  hunting.  Here, 
we  examine  the  question  in  terms  of  initial  population:  what 
fraction  of  the  population  must  initially  hunt  stag  in  order  for 
cooperation  to  flourish? 

Spatial  evolutionary  models  are  described  by  undirected 
graphs,  where  each  vertex  represents  a  player,  and  each  edge 
represents  a  social  link  between  two  players.  As  with  the 
replicator  dynamics,  each  player  is  pre-programmed  to  play  a 
particular pure strategy. But, in the spatial dynamics, a player
may change strategies depending on the relative fitness of
its neighbors. At each generation, players accrue payoff by
playing a single instance of the game with each neighbor. After
play, each player randomly selects a neighbor (possibly itself)
with a probability proportional to the payoff accrued in the
round just played, adopting that player’s pure strategy
for the next round.

We may interpret the spatial dynamics as an imitation dynamics,
where a player imitates the behavior of its neighbors,
or  as  a  death-birth  dynamics,  where  players  “die”  and  give 
rise  to  a  new  generation  whose  strategies  depend  on  the 
neighbors’  relative  fitness.  Regardless  of  interpretation,  for 
fully  connected  graphs,  the  dynamics  converges  to  the  standard 
replicator  dynamics  as  the  population  size  becomes  large  and 
the  time  between  generations  becomes  small. 

In the Stag Hunt, let $N_s(i)$ and $N_h(i)$ be the sets of player
i’s neighbors (including itself) that hunt stag and hare, respectively,
and let $F(i)$ denote the payoff earned by player i during
a single generation. Thus, letting $|\cdot|$ denote the cardinality of
a set, player i earns $F(i) = 4(|N_s(i)| - 1)$ if it hunts stag,
and $F(i) = 3(|N_s(i)| + |N_h(i)| - 1)$ if it hunts hare.8 Next,
define $F_s(i) = \sum_{j \in N_s(i)} F(j)$ and $F_h(i) = \sum_{j \in N_h(i)} F(j)$,
the respective sum payoff of stag- and hare-hunting neighbors.
Finally, since a neighbor is selected with a probability
proportional to its fitness, player i hunts stag during the next
generation with probability $F_s(i) / (F_s(i) + F_h(i))$.
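
The payoff and update rule of the spatial dynamics can be sketched as follows. This is a minimal illustration, not the authors’ code: the graph is stored as a dictionary of neighbor sets that exclude the player itself (so the minus-one corrections in the text are handled implicitly), the function names are hypothetical, and keeping the current strategy when all payoffs are zero is an assumed convention for a case the text does not address.

```python
def payoff(i, strategies, neighbors):
    """Payoff accrued by player i from one game with each neighbor.

    neighbors[i] is the set of i's neighbors, excluding i itself.
    """
    n_stag = sum(1 for j in neighbors[i] if strategies[j] == "s")
    if strategies[i] == "s":
        return 4 * n_stag               # 4 per stag-hunting partner, else 0
    return 3 * len(neighbors[i])        # hare pays 3 against any partner


def stag_probability(i, strategies, neighbors):
    """Probability that player i hunts stag in the next generation."""
    closed = set(neighbors[i]) | {i}    # a player may also select itself
    fitness = {j: payoff(j, strategies, neighbors) for j in closed}
    f_stag = sum(f for j, f in fitness.items() if strategies[j] == "s")
    total = sum(fitness.values())
    if total == 0:
        # Assumed convention: keep the current strategy if no payoff accrued.
        return 1.0 if strategies[i] == "s" else 0.0
    return f_stag / total


# Hypothetical three-player complete graph with two stag hunters.
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
strategies = {0: "s", 1: "s", 2: "h"}
print(stag_probability(0, strategies, neighbors))
```

In this small example each stag hunter earns 4 and the hare hunter earns 6, so player 0 adopts stag with probability 8/14.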

The spatial dynamics is highly dependent on the structure
of the graph used to model the population. We construct our
graphs according to so-called “scale-free” models [36], in
which the number of neighbors follows a power-law distribution.
If $K_i$ is the random variable describing the number
of neighbors for player i, then each $K_i$ is identically and
independently distributed according to $p_{K_i}(k) \propto k^{\gamma}$ for some
constant $\gamma$. This distribution describes a heterogeneous, and
realistic, model of social connectivity: many players have only
a few neighbors, while a few players are heavily connected to
the rest of the population. Scale-free models have been shown
to improve the possibility of cooperation in social dilemmas
[15].

To evaluate the performance of the spatial dynamics, we
construct graphs with 50 players, an average number of
connections per player $z = E(K)$, and an initial fraction of
the population $x_s(0)$ hunting stag. For each $(x_s(0), z)$ pair, we
construct ten graphs, each of which is seeded with ten initial
populations. After running the dynamics for 5000 generations,
we record the steady state behavior by averaging the fraction
of stag hunters over an additional 500 generations. Figure 10
shows the average results of our trials. For moderately low
values of z, the spatial dynamics considerably improves the
possibility for cooperation: a sizeable fraction of the
steady-state population hunts stag even when only a quarter of the
initial population cooperate. This result is consistent with
previous studies of cooperation in spatial networks [15, 16].
When the average number of connections is small, cooperation
emerges more readily. However, in contrast to the attitude
dynamics, stag hunting does not consistently dominate the
population unless a solid majority of players initially cooperate.

8 The (-1) term in each payoff accounts for the fact that, although $N_s(i)$
or $N_h(i)$ includes player i, the player does not pair with itself during play.

VII.  Conclusion 

In  this  paper,  we  have  extended  the  theory  of  satisficing 
games  by  incorporating  elements  from  non-cooperative  game 
theory.  We  augment  the  satisficing  game  with  a  standard  utility 
function  that  gives  the  raw  payoff  to  a  player  for  exhibiting 
particular  attitudes.  The  augmented  framework  results  in  an 
attitude  equilibrium  in  which  no  single  player  can  improve 
its  raw  payoff  by  exhibiting  different  attitudes.  The  attitude 
equilibrium combines the merits of both satisficing and
non-cooperative game theory. The conditional utility structure
allows  players  to  consider  others’  preferences  in  making 
decisions,  and  the  standard  payoff  function  allows  players  to 
adapt  their  attitudes  to  avoid  dysfunctional  behavior. 
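The defining condition of an attitude equilibrium (no single player can improve its raw payoff by exhibiting a different attitude) can be checked by brute force when the attitude spaces are finite. The sketch below is a minimal illustration, not the procedure used in this paper: it considers only pure attitude profiles, and the names `payoff`, `spaces`, and the table `TABLE` are hypothetical stand-ins for the raw payoff function and attitude spaces.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Profile = Tuple[int, ...]  # one attitude index per player (assumed encoding)


def is_attitude_equilibrium(
    payoff: Callable[[int, Profile], float],
    profile: Profile,
    spaces: Sequence[Sequence[int]],
) -> bool:
    """True if no player gains raw payoff by unilaterally changing attitude."""
    for i, space in enumerate(spaces):
        current = payoff(i, profile)
        for alt in space:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoff(i, deviated) > current:
                return False
    return True


def attitude_equilibria(
    payoff: Callable[[int, Profile], float],
    spaces: Sequence[Sequence[int]],
) -> List[Profile]:
    """Enumerate every pure attitude profile and keep the equilibria."""
    return [p for p in product(*spaces) if is_attitude_equilibrium(payoff, p, spaces)]


# Hypothetical two-player table with a Stag Hunt flavour:
# attitude 1 pays off only if both players exhibit it, attitude 0 is safe.
TABLE = {
    (0, 0): (3, 3),
    (0, 1): (3, 0),
    (1, 0): (0, 3),
    (1, 1): (4, 4),
}
toy_payoff = lambda i, profile: TABLE[profile][i]
equilibria = attitude_equilibria(toy_payoff, [(0, 1), (0, 1)])
# equilibria == [(0, 0), (1, 1)]
```

In this toy table only the two matched profiles survive the unilateral-deviation test; the mismatched profiles are eliminated because one player can always gain by switching.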

The  non-cooperative  elements  of  augmented  satisficing 
games  allow  us  to  employ  evolutionary  game  theory,  where 
adaptation  occurs  by  trial-and-error.  We  define  an  attitude 
dynamics  by  applying  the  standard  replicator  dynamics  to  the 
attitudes exhibited by the players, rather than to the strategies
they play. The attitude dynamics models the evolution of players'
attitudes  according  to  the  game  and  the  attitudes  of  other 
players.  Given  appropriate  initial  conditions,  the  steady  state 
of  the  dynamics  is  an  attitude  equilibrium. 
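To make the replicator update concrete, here is a minimal sketch of a forward-Euler step of the standard replicator dynamics over a finite set of types. In the attitude dynamics the types would be attitudes rather than strategies, with fitness derived from the augmented satisficing game; the fixed matrix `STAG_HUNT`, the step size `dt`, and the function names are assumptions made purely for illustration.

```python
from typing import List, Sequence


def replicator_step(
    shares: Sequence[float],
    payoff: Sequence[Sequence[float]],
    dt: float = 0.1,
) -> List[float]:
    """One Euler step of dx_i/dt = x_i * (f_i - mean fitness),
    where f_i = sum_j payoff[i][j] * x_j."""
    n = len(shares)
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean = sum(shares[i] * fitness[i] for i in range(n))
    updated = [shares[i] + dt * shares[i] * (fitness[i] - mean) for i in range(n)]
    total = sum(updated)  # renormalize to guard against rounding drift
    return [u / total for u in updated]


def evolve(
    shares: Sequence[float],
    payoff: Sequence[Sequence[float]],
    generations: int,
) -> List[float]:
    """Apply `replicator_step` repeatedly and return the final shares."""
    current = list(shares)
    for _ in range(generations):
        current = replicator_step(current, payoff)
    return current


# Hypothetical Stag Hunt payoffs: type 0 = stag, type 1 = hare.
STAG_HUNT = [[4.0, 0.0], [3.0, 3.0]]
mostly_stag = evolve([0.8, 0.2], STAG_HUNT, 2000)
mostly_hare = evolve([0.7, 0.3], STAG_HUNT, 2000)
```

With these illustrative payoffs the stag share grows only when it starts above 3/4, which shows how strongly the outcome of a classic replicator model depends on the initial conditions.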

We  have  presented  a  satisficing  model  for  the  Stag  Hunt, 
a  game  under  which  it  is  difficult  to  evolve  a  cooperative 
population. Under the augmented satisficing framework,
dysfunctional behaviors vanish: the attitude equilibria lie entirely
within  the  regions  where  players  either  mutually  hunt  stag  or 
mutually  hunt  hare.  Also,  the  attitude  dynamics  facilitates  the 
evolution  of  cooperation  by  introducing  strategic  complexity 
into  the  dynamics.  Instead  of  simply  choosing  whether  or  not 
to  hunt  stag,  a  player  chooses  a  risk-aversion  level,  which 
governs  its  interaction  with  the  rest  of  the  population.  Under 
a  wide  variety  of  circumstances,  the  dynamics  encourages  the 
population  to  become  less  risk  averse,  allowing  cooperation  to 
flourish. Our results significantly outperform other evolutionary
methods, including classic replicator models and recently
proposed spatial evolutionary models.

Finally, the theoretical properties that borrow from non-cooperative
game theory suggest that our results will generalize
to large classes of games. Specifically, any game with finite
attitude  spaces  must  have  an  attitude  equilibrium,  and  any 
(properly  initialized)  steady  state  of  the  attitude  dynamics  is 
an  attitude  equilibrium.  While  we  cannot,  of  course,  guarantee 
any  specific  results,  we  may  expect  that  the  qualitative  benefits 
of  our  approach  will  pertain  to  other  games. 
(c) t = 100


Fig.  9.  Joint  attitude  distribution  for  a  =  0.925,  A  =  10. 
Fig. 10. Average steady-state stag-hunting fraction under spatial evolutionary
dynamics.


